Handling no coordinator and data loss in ICR mode #12372
base: main
Conversation
close();

// We need to close the worker here in every case to ensure exactly-once.
committer.stop(ResourceType.WORKER);
if (committer.isLeader(partitions)) {
Now that it has been reworked, we should rename `isLeader`.
Sure, will incorporate that.
package org.apache.iceberg.connect;

public enum ResourceType {
I don't think we need this enum, we can just have methods for each in the committer.
Yeah, I thought that initially but didn't do it since it would have added a lot of methods to the committer. I will make that change.
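A minimal sketch of what that change could look like, with hypothetical method names (the actual PR may choose different ones): explicit per-resource lifecycle methods instead of dispatching on a `ResourceType` enum.

```java
// Hypothetical sketch only: explicit lifecycle methods replacing
// start(ResourceType)/stop(ResourceType). All names are illustrative.
public class LifecycleSketch {
  interface CommitterLifecycle {
    void startWorker();
    void stopWorker();
    void startCoordinator();
    void stopCoordinator();
  }

  // Trivial stand-in implementation that only tracks state, for illustration.
  static class TrackingCommitter implements CommitterLifecycle {
    boolean workerRunning;
    boolean coordinatorRunning;

    @Override public void startWorker() { workerRunning = true; }
    @Override public void stopWorker() { workerRunning = false; }
    @Override public void startCoordinator() { coordinatorRunning = true; }
    @Override public void stopCoordinator() { coordinatorRunning = false; }
  }

  public static void main(String[] args) {
    TrackingCommitter committer = new TrackingCommitter();
    // Call sites become explicit instead of committer.stop(ResourceType.WORKER):
    committer.startWorker();
    committer.stopWorker();
    System.out.println(committer.workerRunning); // false
  }
}
```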
worker = new Worker(config, clientFactory, sinkWriter, context);
worker.start();

@Override
public void start(ResourceType resourceType) {
We can just call the type-specific methods directly.
Sure
coordinatorThread.terminate();
coordinatorThread = null;

@Override
public void stop(ResourceType resourceType) {
Likewise, we can just call the type-specific methods directly instead.
public void start(Catalog catalog, IcebergSinkConfig config, SinkTaskContext context) {
  KafkaClientFactory clientFactory = new KafkaClientFactory(config.kafkaProps());

public boolean isLeader(Collection<TopicPartition> currentAssignedPartitions) {
We should rename this now that its functionality has changed.
We should add some comments, and also tests if feasible. Also, it looks like there are some code formatting issues.
public interface Committer {
  void start(Catalog catalog, IcebergSinkConfig config, SinkTaskContext context);

  void startCoordinator();
Something I should have brought up sooner: the `Committer` interface is meant to be generic, i.e. you could plug in a committer that uses server-side coordination, and in that case these new methods aren't relevant. Also, this modifies a public interface, which could be a breaking change for some.
If we do not want to modify the interface, we somehow need to pass the currently added or removed partitions so the committer can handle the coordinator start and termination logic.
Since the committer is only used in the sink task, and `CommitterImpl` is the only class implementing this interface, I don't see how this would break compatibility. Also, are we planning to expose a config for providing a committer implementation? And what is meant by "server-side coordination" here?
void save(Collection<SinkRecord> sinkRecords);

boolean isCoordinator(Collection<TopicPartition> currentAssignedPartitions);
I don't think the naming here is accurate either, i.e. this could return false even for a task that is the coordinator node. Also, this isn't relevant to other types of committers that aren't using a coordinator.
Can you please suggest a name for this?
Why would this return false for a node that is the coordinator? It would not return false for a coordinator node.
If a coordinator node calls `close` with the partitions being closed, and those partitions don't contain the first partition, then it will return false.
Yeah, isn't that correct? Since the partitions did not contain partition 0, the committer is telling you that you are not the coordinator, hence the name `isCoordinator`. Alternatively, I can name it `amITheCoordinator`.
It is similar to Kafka, where the method `isLeader()` checks whether the consumer is the leader of the group. Likewise, the task checks `isCoordinator`.
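The contested check is easy to see in isolation. This is an illustrative sketch only (class and method names are assumed, not the actual PR code): the coordinator is the task whose assignment contains the first (zeroth) partition.

```java
import java.util.Collection;
import java.util.Set;

// Illustrative sketch only: the task whose assignment contains the zeroth
// partition is elected coordinator. Names here are assumed for illustration.
public class CoordinatorCheck {
  static boolean containsFirstPartition(Collection<Integer> partitions) {
    return partitions.contains(0);
  }

  public static void main(String[] args) {
    // A task holding P0 passes the check...
    System.out.println(containsFirstPartition(Set.of(0, 1))); // true
    // ...but close() invoked with only [P1] fails it, even on the node that
    // currently runs the coordinator, which is the ambiguity reviewers raise.
    System.out.println(containsFirstPartition(Set.of(1))); // false
  }
}
```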
committer = CommitterFactory.createCommitter(config);
committer.start(catalog, config, context);

// We should start the coordinator only if the list of partitions contains the zeroth partition.
if (committer.isCoordinator(partitions)) {
We should move this logic into the committer implementation.
If we want to move this logic to the committer, then we need to somehow pass the partitions information, which would require modifying the committer method signatures.
// We need to close the worker here in every case to ensure exactly-once; otherwise this will lead to duplicate records.
committer.stopWorker();
// The coordinator should only be closed if the received closed partitions include the partition that elected this task as coordinator.
if (committer.isCoordinator(partitions)) {
Likewise, we should move this into the committer implementation rather than having it in the sink task.
If we want to move this logic to the committer, then we need to somehow pass the partitions information, which would require modifying the committer method signatures.
}

private void close() {
  if (committer != null) {
    committer.stop();
    committer.stopWorker();
Same as above, we should keep a generic `stop` and let the committer implementation decide what to do.
Few observations:
In ICR mode, "open" receives only the newly added partitions, and "open" will not be called by the Connect framework if no new partitions are assigned to the task.
Similarly, in the case of "close", the task receives only the removed partitions, but we blindly close the coordinator.
The coordinator is created only when "open" is called, but when a partition is revoked and no partition is added to the task, only "close" will be called with that revoked partition, without any "open" call.
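Putting these observations together, a committer that owns the start/stop decision could track its cumulative assignment across the incremental open/close calls. This is a hedged sketch with assumed names and plain integer partition ids, not the actual PR code:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hedged sketch (assumed names): a committer that derives the coordinator
// lifecycle from the incremental partition deltas that ICR delivers.
public class IcrAwareCommitter {
  private final Set<Integer> assigned = new HashSet<>();
  private boolean coordinatorRunning = false;

  // "open" delivers ONLY the newly added partitions; it is not called at all
  // when a rebalance adds nothing to this task.
  public void open(Collection<Integer> addedPartitions) {
    assigned.addAll(addedPartitions);
    if (assigned.contains(0) && !coordinatorRunning) {
      coordinatorRunning = true; // start the coordinator thread here
    }
  }

  // "close" delivers ONLY the removed partitions; the coordinator should be
  // stopped only if partition 0 (the electing partition) is among them.
  public void close(Collection<Integer> removedPartitions) {
    assigned.removeAll(removedPartitions);
    if (removedPartitions.contains(0) && coordinatorRunning) {
      coordinatorRunning = false; // terminate the coordinator thread here
    }
  }

  public boolean isCoordinatorRunning() {
    return coordinatorRunning;
  }
}
```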
How this leads to a No-Coordinator scenario:
Let's see this with the example below:
Initially we have one worker "W0" with two tasks "T0" and "T1" consuming from two partitions of one topic, namely "P0" and "P1", so the initial configuration is:
W0 -> [{T0, P0}, {T1, P1}] -> this elects "T0" as the coordinator, as it has "P0"
Now another worker "W1" joins -> this leads to rebalancing of both the tasks and the topic partitions within those tasks.
State at this point in time:
W0 -> [{T0, [P0, P1]}]
W1 -> []
Now, assume P1 is revoked from T0 so it can move to W1: "close" is called on T0 with only [P1], and the current code blindly terminates the coordinator. Since no new partition is assigned to T0, "open" is never called on it, so the coordinator is never restarted.
Hence this leads to a No-Coordinator scenario.
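The sequence above can be replayed with a toy model of the current "blind close" behavior (all names and types here are simplified stand-ins, not the real sink task):

```java
import java.util.List;

// Toy model of the scenario: a committer that stops the coordinator on ANY
// close() call, mirroring the "blindly close" behavior described above.
public class NoCoordinatorDemo {
  static class BlindCommitter {
    boolean coordinatorRunning = false;

    void open(List<Integer> addedPartitions) {
      if (addedPartitions.contains(0)) {
        coordinatorRunning = true; // the task holding P0 becomes coordinator
      }
    }

    void close(List<Integer> removedPartitions) {
      coordinatorRunning = false; // current behavior: stop regardless of which partitions closed
    }
  }

  public static void main(String[] args) {
    BlindCommitter t0 = new BlindCommitter();
    t0.open(List.of(0, 1)); // after W1 joins: T0 owns [P0, P1], coordinator runs
    t0.close(List.of(1));   // ICR revokes only P1 to hand it to W1
    // No new partitions were assigned to T0, so open() is never called again:
    System.out.println(t0.coordinatorRunning); // false: no coordinator anywhere
  }
}
```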
Data Loss Scenario:
In Incremental Cooperative Rebalancing (ICR) mode, when a rebalance happens, consumers do not stop consuming, as there is no stop-the-world phase like in the "Eager" rebalancing mode.
When a partition is removed from a task, the consumer coordinator calls "close(Collection)" on the sink task. In this call, since we blindly dump all the files, we also dump the records for the partitions still retained by this task. Moreover, the close call sets the committer to null, and since we have no null check for the committer in "put(Collection)", records are silently ignored. Once we get an "open(Collection)" call from Kafka, the committer is restarted without resetting the offsets, which leads to data loss.
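One possible shape of a fix, sketched with simplified, assumed types (integer partition ids and a plain offset map instead of real writers); in the real sink task the returned offsets would be handed to `SinkTaskContext.offset(...)` so the framework rewinds instead of skipping:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hedged sketch (assumed names, simplified types): on close(), flush and drop
// state only for the partitions actually removed, and report the offsets the
// framework should rewind to, instead of dumping everything wholesale.
public class PartitionScopedWorker {
  // Last committed offset per partition; a stand-in for per-partition writer state.
  private final Map<Integer, Long> committedOffsets = new HashMap<>();

  public void record(int partition, long offset) {
    committedOffsets.put(partition, offset);
  }

  // Returns the offsets to reset to for the removed partitions only;
  // retained partitions keep their in-flight state untouched.
  public Map<Integer, Long> close(Collection<Integer> removedPartitions) {
    Map<Integer, Long> resetTo = new HashMap<>();
    for (Integer partition : removedPartitions) {
      Long lastCommitted = committedOffsets.remove(partition);
      if (lastCommitted != null) {
        resetTo.put(partition, lastCommitted + 1); // resume after the last committed record
      }
    }
    return resetTo;
  }

  public Map<Integer, Long> committed() {
    return committedOffsets;
  }
}
```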
Document explaining both scenarios: https://docs.google.com/document/d/1okqGq1HXu2rDnq88wIlVDv0EmNFZYB1PhgwyAzWIKT8/edit?tab=t.0#heading=h.51qcys2ewbsa