
[CSI] trigger StopEndpoint if StartEndpoint has failed with GRPC Timeout error #2802

Merged
antonmyagkov merged 2 commits into main from users/myagkov/issue-2801 on Jan 13, 2025

Collaborator

@antonmyagkov antonmyagkov commented Jan 6, 2025

issue: #2801

All StartEndpoint/StopEndpoint requests land in the same queue (see nbs/cloud/blockstore/libs/storage/service/volume_session_actor_mount.cpp at main · ydb-platform/nbs), and NBS handles only one request at a time.

A StartEndpoint request from the CSI driver can fail with a timeout, but the request that is already in the queue keeps retrying to start the endpoint for much longer. This causes the delete volume operation to hang, because the CSI driver never sends a StopEndpoint request: NodePublishVolume has failed, so NodeUnpublishVolume is not called.

Solution:
Send a StopEndpoint request (in NodePublishVolume/NodeStageVolume) if the StartEndpoint request fails with a gRPC timeout error.

@antonmyagkov antonmyagkov added the blockstore Add this label to run only cloud/blockstore build and tests on PR label Jan 6, 2025
Contributor

github-actions bot commented Jan 6, 2025

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit 0f788d5.

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 3560  | 3560   | 0      | 0      | 0       | 0      |

@antonmyagkov antonmyagkov force-pushed the users/myagkov/issue-2801 branch from a46ee7f to 08add02 Compare January 7, 2025 17:06
Contributor

github-actions bot commented Jan 7, 2025

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit 08add02.

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 3562  | 3562   | 0      | 0      | 0       | 0      |

@antonmyagkov antonmyagkov requested a review from tpashkin January 7, 2025 17:59
@antonmyagkov antonmyagkov added the rebase Add this label if you want to rebase your PR for test run label Jan 12, 2025
Contributor

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit 08add02.

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 3571  | 3571   | 0      | 0      | 0       | 0      |

@antonmyagkov antonmyagkov merged commit 48a64e3 into main Jan 13, 2025
21 of 22 checks passed
@antonmyagkov antonmyagkov deleted the users/myagkov/issue-2801 branch January 13, 2025 16:23
Labels
blockstore: Add this label to run only cloud/blockstore build and tests on PR
rebase: Add this label if you want to rebase your PR for test run
3 participants