-
Notifications
You must be signed in to change notification settings - Fork 869
feat(shard-distributor): Add ping verification to ephemeral shard creator #7496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
jakobht
wants to merge
14
commits into
cadence-workflow:master
Choose a base branch
from
jakobht:add-ephemeral-shard-ping
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
feat(shard-distributor): Add ping verification to ephemeral shard creator #7496
jakobht
wants to merge
14
commits into
cadence-workflow:master
from
jakobht:add-ephemeral-shard-ping
+884
−62
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
a043f8b to
f9d6569
Compare
7a49361 to
4b3f347
Compare
Implement a YARPC peer chooser that routes requests to the correct
executor based on shard ownership. This is the shard distributor
equivalent of Cadence's RingpopPeerChooser.
Flow:
1. Client calls RPC with yarpc.WithShardKey("shard-key")
2. Chooser queries Spectator for shard owner
3. Extracts grpc_address from owner metadata
4. Creates/reuses peer for that address
5. Returns peer to YARPC for connection
The peer chooser maintains a cache of peers and handles concurrent
access safely. It uses the x-shard-distributor-namespace header to
determine which namespace's spectator to query.
Dependencies:
- Requires spectator GetShardOwner to return metadata (see previous commit)
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
The tests have dependency issues with mock generation that need to be resolved separately. The peer chooser implementation is complete and functional. Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Tests cover: - Success path with peer creation - Peer reuse on subsequent calls - Error cases (missing shard key, namespace header, spectator not found) - Lifecycle methods (Start, Stop, IsRunning) - SetSpectators method Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
…peer creation and mutex lock and unlock Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Implement the client-side pinger that periodically pings random shard owners to verify: 1. Executors can route to each other based on shard ownership 2. Shard ownership information is accurate 3. The shard distributor is functioning correctly The pinger: - Selects random shards at regular intervals (1s with 10% jitter) - Sends ping requests to the executor owning each shard - Validates that the receiving executor actually owns the shard - Logs warnings when ownership is incorrect Dependencies: - Requires ShardDistributorExecutorCanaryAPI proto and client - Will use SpectatorPeerChooser for routing (wired in later commit) Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
80a9f46 to
c5e5cc9
Compare
After creating a shard, the ephemeral shard creator now pings the owner to verify that: 1. The shard was created successfully 2. The owner can be reached via gRPC 3. The owner actually owns the shard This provides end-to-end validation of the shard creation and routing mechanisms in the canary test environment. Changes: - Switch from using ShardDistributor client to using Spectators - Add pingShardOwner() method that sends canary ping after creation - Verify executor ID and ownership in ping response - Log warnings if executor ID mismatches or ownership is incorrect - Add test coverage for ping verification flow - Use refactored *spectatorclient.Spectators type Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Integrate all canary components into the module: - Create SpectatorPeerChooser and manage its lifecycle - Provide canary client using the dispatcher - Create and start the pinger component - Register ping handler as a YARPC server - Wire spectators to peer chooser on startup This connects all the pieces needed for executor-to-executor canary testing via ping/pong requests. Dependencies: - Requires SpectatorPeerChooser - Requires PingHandler - Requires Pinger component Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
c5e5cc9 to
ae1cf2b
Compare
Signed-off-by: Jakob Haahr Taankvist <[email protected]>
5dc1df9 to
0046451
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What changed?
The ephemeral shard creator now pings shard owners immediately after creation to verify end-to-end functionality.
Why?
This adds canary verification that validates:
Previously, the shard creator only verified that GetShardOwner returned successfully, but didn't verify that the executor was actually reachable or owned the shard.
How did you test it?
Unit tests
Potential risks
Low risk - this only affects the canary test environment and adds verification without changing core shard creation logic.
Documentation