-
-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of invalid base offsets #846
base: master
Are you sure you want to change the base?
Conversation
The more you know. I'll push a patch release with this soon. |
What scenario would this happen in? I can't find anywhere in the Kafka codebase where the DuplicateSequence exception is actually thrown, and in the correct case where the client only has 5 outstanding idempotent batches, the Kafka broker will return the correct offsets/timestamp for duplicate batches |
(I'm also interested if there's an answer for the above question) |
Honestly I can't say definitively, as this was triggered through some long-running fuzz tests against a very unhealthy cluster. Regardless, this patch improves kgo's behavior to never return erroneous offsets on records under any circumstances, which I feel is generally good. |
(To be clear, I think neither Apache Kafka nor RedPanda have ever returned the duplicate sequence number error.) |
@twmb What's your take on this patch? It seems valuable for this client to include all the safeguards of the reference client, even if this code path isn't exercised when running against the Apache Kakfa or Redpanda implementations. |
I think this is fine and I'll merge this, but since it isn't an active problem, I'll roll this into the next release train rather than cutting a patch with just this. |
FYI https://github.com/jepsen-io/redpanda is explicitly designed to handle records with -1 offsets. Though again, I am not sure when/why they are encountered. |
When a successfully produced batch is returned with an invalid base offset (-1) but no error, the records' internal offsets are being miscomputed from this value. An invalid offset is returned with a non-errored batch primarily in situations where a duplicate idempotent produce is detected, but the actual base offset cannot be provided. Apache Kafka's producer client handles this by setting all record offsets to -1 if the batch's base offset is -1.
This patch makes this correction in the kgo producer callback logic, as well as in the
recordToRecord
function (though an invalid offset is probably impossible on the fetch side).