In what area(s)?
/area runtime
/area operator
AEH Binding, AEH PubSub
Describe the feature
Problem Statement
After a rolling upgrade or restart of an application using an Azure Event Hubs binding (or pub/sub), the underlying Azure SDK client generates a new client identifier (a UUID/GUID) by default if one is not provided, whichever SDK is in use (Go, .NET, Java, and so on). This same identifier is written to blob storage metadata for checkpoint management to signify ownership of a partition by a particular client in a pool of clients (or by a single client). This is the mechanism by which partitions are balanced among a group of competing clients for a given consumer group. The default timeout for a lease is two minutes. A client renews its lease by updating the modified time on the placeholder blob.
Take a single client bound to an event hub with 32 partitions: each time you restart, the client slowly acquires leases on these partitions as each lease reaches the two minute timeout. Until a lease is acquired, no messages in that partition will be processed. The single client has no way of knowing whether the old GUID belongs to an existing competing client or is its own former client/owner ID. In my experience, it can take up to 5 minutes to reclaim all partitions with a single client in a dev environment. This is painful :/
example metadata signifying client ownership over partition 3 for event hub "device-in"
Solution
Allowing the client to have a "static" identifier means that when the client restarts, it can start receiving messages right away: the blob metadata carries the same identifier, so the client believes it already owns all the leases.
In production scenarios, you would typically bind the identifier to the pod name in a Kubernetes StatefulSet (not a ReplicaSet) so that a given client keeps the same identity across restarts. This should be easy to do via a secretRef with an environment-bound secret store (or maybe there's a better way?)
References
In the Go SDK, this property seems to be aliased as InstanceID and ClientID across the client options.
https://github.com/Azure/azure-sdk-for-go/blob/sdk/messaging/azeventhubs/v1.2.2/sdk/messaging/azeventhubs/consumer_client.go#L156
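Microsoft SDK quote:
"It is recommended that you set a stable unique identifier for processor instances, as this allows the processor to recover partition ownership when an application or host instance is restarted. It also aids readability in Azure SDK logs and allows for more easily correlating logs to a specific processor instance."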
Release Note
I would suggest repeating the Microsoft SDK information, and suggesting using stateful sets and pod names as the identifier for production. A fixed identifier is very useful for dev, especially if you have multiple partitions.
RELEASE NOTE: