CCOL-2441: Allow custom publishing batch size per producer #216

Merged
10 commits merged on May 13, 2024

ariana-flipp
Collaborator

Pull Request Template

Description

Adds the ability to configure max_batch_size per producer instead of relying on the hard-coded limit of 500. The global setting still defaults to 500.

Fixes CCOL-2441
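
For illustration only, here is a minimal sketch of the per-producer override described above. It does not use the Deimos gem: SketchProducer, SmallBatchProducer, GLOBAL_MAX_BATCH_SIZE, and batches_for are hypothetical stand-ins, and the shape of the config hash is an assumption. Only the max_batch_size class method name and the default of 500 come from this PR.

```ruby
# Hypothetical stand-in for the pattern in this PR; not the Deimos source.
class SketchProducer
  # Assumed to mirror the global producers.max_batch_size default.
  GLOBAL_MAX_BATCH_SIZE = 500

  class << self
    # Per-class config hash (assumed shape). Each subclass lazily gets its own.
    def config
      @config ||= { max_batch_size: GLOBAL_MAX_BATCH_SIZE }
    end

    # Override the batch size for this producer class only.
    def max_batch_size(size)
      config[:max_batch_size] = size
    end

    # Split messages into batches no larger than the configured size.
    def batches_for(messages)
      messages.each_slice(config[:max_batch_size]).to_a
    end
  end
end

# A producer that opts into a smaller batch size.
class SmallBatchProducer < SketchProducer
  max_batch_size 2
end
```

In this sketch the override is stored on the subclass, so SketchProducer and any other producers keep the default.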

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Added specs

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added a line in the CHANGELOG describing this change, under the UNRELEASED heading
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Member

@dorner dorner left a comment

Overall looks good!

@@ -197,6 +197,7 @@ producers.schema_namespace|nil|Default namespace for all producers. Can remain n
producers.topic_prefix|nil|Add a prefix to all topic names. This can be useful if you're using the same Kafka broker for different environments that are producing the same topics.
producers.disabled|false|Disable all actual message producing. Generally more useful to use the `disable_producers` method instead.
producers.backend|`:kafka_async`|Currently can be set to `:db`, `:kafka`, or `:kafka_async`. If using Kafka directly, a good pattern is to set to async in your user-facing app, and sync in your consumers or delayed workers.
producers.max_batch_size|500|Maximum batch size for publishing. Individual producers can override.
Member

Missing the doc for the new per-producer setting?

@@ -92,6 +90,12 @@ def partition_key(_payload)
nil
end

# @param size [Integer] Override the default batch size for publishing.
# @return [void]
def max_batch_size(size)
Member

We probably don't need an actual method here; we can just assign it directly to the config inside configuration.rb.

Collaborator Author

@ariana-flipp ariana-flipp May 8, 2024

I'm not 100% sure what's going on, but it seems like the producer specs are not working as intended in general. As you suggested, I added a line in configuration.rb to allow a per-producer batch size setting, like so:

if kafka_config.respond_to?(:bulk_import_id_column) # consumer
    ...
else # new code here for producer
    klass.config[:max_batch_size] = kafka_config.max_batch_size || Deimos.config.producers.max_batch_size
end

I tried setting max_batch_size for a producer in item-feeds and it works there. However, the self.configure_producer_or_consumer method is never reached for stubbed producers in producer_spec.rb, so this config is never set.

Also, now that I removed the max_batch_size method from producer.rb, I'm getting this error when running producer specs:

Deimos::Producer should produce a message
     Failure/Error: max_batch_size 1

NoMethodError:
     undefined method `max_batch_size'
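
To make the behavior described in this comment concrete, here is a rough sketch of the fallback (per-producer value, else the global default) and of why a producer class that never passes through the configure step ends up with no value. All names here (GlobalProducers, ProducerEntry, SketchBase, FeedProducer, StubbedProducer, configure_producer) are hypothetical and chosen for illustration; this is not the Deimos implementation.

```ruby
# Hypothetical names throughout; a sketch of the fallback, not Deimos code.
GlobalProducers = Struct.new(:max_batch_size)
ProducerEntry = Struct.new(:class_name, :max_batch_size)

# Assumed global default, matching the documented value of 500.
GLOBAL = GlobalProducers.new(500)

class SketchBase
  # Per-class config hash; starts empty until a configure step fills it.
  def self.config
    @config ||= {}
  end
end

class FeedProducer < SketchBase; end
class DefaultProducer < SketchBase; end
class StubbedProducer < SketchBase; end

# Mirrors the quoted `else` branch: use the per-producer value if set,
# otherwise fall back to the global setting.
def configure_producer(klass, entry, global)
  klass.config[:max_batch_size] = entry.max_batch_size || global.max_batch_size
end

configure_producer(FeedProducer, ProducerEntry.new('FeedProducer', 100), GLOBAL)
configure_producer(DefaultProducer, ProducerEntry.new('DefaultProducer', nil), GLOBAL)
# StubbedProducer is deliberately never configured, like a spec-only stub.
```

In this sketch FeedProducer gets its own value, DefaultProducer inherits the global one, and StubbedProducer has no max_batch_size at all because the configure step never ran for it.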

Member

Can you push your branch? I can take a look.

Collaborator Author

Pushed!

Member

Yeah... we're going to change this entirely in V2 🙃 I forgot that the stubbed producers don't go through the normal process. OK - you can put it back the way it was!

Member

dorner commented May 13, 2024

LGTM!

@dorner dorner merged commit 51b5093 into master May 13, 2024
5 of 6 checks passed
@dorner dorner deleted the CCOL-2441 branch May 13, 2024 19:28