Skip to content

Batching support#54

Open
fmterrorf wants to merge 6 commits into
commanded:masterfrom
calmwave-open-source:batching-support
Open

Batching support#54
fmterrorf wants to merge 6 commits into
commanded:masterfrom
calmwave-open-source:batching-support

Conversation

@fmterrorf

Copy link
Copy Markdown

In conjuction commanded/commanded#569 adds support for batching Ecto projections.

nikkocampbell and others added 6 commits September 18, 2024 15:43
Added tests and relevant fixes to batch projector

added after update callback test

Update dependencies

docs

temporary commanded version

Skip over partial seen batch

remove elixir_uuid after rebase

Return error on partially seen batch again
@fmterrorf fmterrorf marked this pull request as ready for review September 20, 2024 21:42
@cdegroot

Copy link
Copy Markdown

Note: review can wait until Calmwave has dogfooded this for a bit.

@drteeth drteeth left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work on this feature. It looks great, well done.

As this PR relies on upstream changes to Commanded, it will have to wait until that work is merged. See commanded/commanded#569

Before calling the PR done, the reference to the calmwave fork would need to be dropped of course.

@anderslemke

Copy link
Copy Markdown

I'm curious what the status on this is. What are the experience in Calmwave, @cdegroot ?

Also, I have a question:

Let's say I set batch size to 100.
Now, let's assume that the projection is up to date, and a new event is added.
Will I get the call to project_batch right away, or will it wait until we have another 99 events ready for the batch?

@yordis

yordis commented Nov 18, 2025

Copy link
Copy Markdown
Contributor

@anderslemke The subscription flushes on timeout (milliseconds) if fewer events are available, so batch_size: 100 won't wait for 99 more events.

@yordis yordis left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a few issues with partial batch handling:

  1. No event filtering - When a batch contains already-seen events (e.g., events [2,3,4] when watermark is 2), the user projection receives all events instead of just [3,4]. This can cause duplicate projections.

  2. No locking - The watermark check and update aren't atomic. Two concurrent batches could both pass the check and cause race conditions.

  3. Test verification - The test at lines 74-91 should verify that "e4" was actually projected, not just that the call succeeded.

The core issue is that apply(multi_fn, [multi]) at line 152 passes the multi but not the filtered events, so the user's lambda processes the original batch from closure scope.

A robust approach would be: lock → filter unseen events → update watermark → pass only unseen events to user projection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants