Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vfkit: More robust state management #20506

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

nirs
Copy link
Contributor

@nirs nirs commented Mar 9, 2025

We get and set vfkit state using the HTTP API. This is not very robust since the API is provided by the process we are terminating:

  • Setting the state to HardStop typically fails when vfkit is terminated, leading to unneeded retries and delays
  • Setting the state to Stop may fail if vfkit is already stopped or shutting down, or does not listen to the socket.
  • Getting the state is racy, since vfkit may be shutting down or terminate right after checking the pid (time of check, time of use).
  • The pidfile was not removed in all cases, or removed too late.

This change fixes the issues by using vfkit API only for setting state, and falling back to sending a signal when using the API fails.

New generic helpers added for working with pidfile. I plan to use them for vment package (#20501) and they can also be used by any driver managing pidfile.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 9, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nirs
Once this PR has been reviewed and has the lgtm label, please assign spowelljr for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 9, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @nirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 9, 2025
@minikube-bot
Copy link
Collaborator

Can one of the admins verify this patch?

return nil
}
return nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crc has some similar code https://github.com/crc-org/crc/blob/4bdcc16077a84583ad85ea5b8b75eefab13ba82d/pkg/drivers/vfkit/driver_darwin.go#L403-L455 , would be useful to move this to vfkit instead of each project having its own implementation of this.

if err != nil {
return err
}
log.Infof("Sending signal %q to pid %v", sig, pid)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crc checks that the process name contains vfkit in case the pid file has stale content, and the pid was reused by some unrelated process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a link to this code? this is very much needed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err := d.SetVFKitState("Stop"); err != nil {
// vfkit may be already stopped, shutting down, or not listening.
log.Debugf("Failed to set vfkit state to 'Stop': %s", err)
return signalPidfile(d.pidfilePath(), syscall.SIGTERM)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe remove the pid file here so that GetState returns Stopped after Stop was called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signalPidfile removes the pid when the process does not exist. But if the signal is sent the process still exists and the caller will discover it by polling on GetState(). Stop() is not responsible for waiting until the process is stopped, the caller is implementing this for all drivers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that if d.SetVFKitState("Stop") succeeds, you have a small window for a race when GetState will return running while vfkit is stopping. It's probably not really important, as GetState will eventually get to a consistent state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use the API to get running state, so we don't have this race. But vfkit docs tells abut VirtualMachineStopping - we don't return this state after calling Stop?

@@ -136,6 +136,49 @@ func (d *Driver) GetIP() (string, error) {
return d.IPAddress, nil
}

func writePidfile(pidfile string, pid int) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there might be a util or lib that we are using in other parts of the code for writing or Locking PID files

if err := d.SetVFKitState("HardStop"); err != nil {
return err
if err := d.SetVFKitState("Stop"); err != nil {
// vfkit may be already stopped, shutting down, or not listening.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to see we try to Force Stop using internal command first
and if Vfkit force stop is not doing the right thing we should create an issue
and comment the link to the issue here
so when that issue is fixed we can clean up this code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create an issue for vfkit and add a link here.

I can add fallback using "HardStop", and then fallback to sending a signal. But the HardStop is very likely to fail so this will only add more failure logs to the process without improving anything. I can post another pr with this change later for testing.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 10, 2025
nirs added 5 commits March 11, 2025 01:21
This will be useful to implement stop and kill with signals.

Remove unneeded os.Stat() call. os.ReadFile() will return
os.ErrNotExists if the file does not exist, so we don't need extra
syscall. This is also fix theoretical race if the pidfile is created
after we check if the file exist.

The new helper is generic like checkPid() and can be reused later by all
drivers managing a pidfile.
For consistency with readPidfile(), and cleaner code. This helper can
also generic and can be used by any driver managing a pidfile.
GetState() had several issues:

- When accessing vfkit HTTP API, we handled only "running",
  "VirtualMachineStateRunning", "stopped", and
  "VirtualMachineStateStopped", but there are other 10 possible states,
  which we handled as state.None, when vfkit is running and need to be
  stopped. This can lead to wrong handling in the caller.

- When handling "stopped" and "VirtualMachineStateStopped" we returned
  state.Stopped, but did not remove the pidfile. This can cause
  termination of unrelated process or reporting wrong status when the
  pid is reused.

- Accessing the HTTP API will fail after we stop or kill it. This
  cause GetState() to fail when the process is actually stopped, which
  can lead to unnecessary retries and long delays (kubernetes#20503).

- When retuning state.None during Remove(), we use tried to do graceful
  shutdown which does not make sense in minikube delete flow, and is not
  consistent with state.Running handling.

Accessing vfkit API to check for state does not add much value for our
used case, checking if the vfkit process is running, and it is not
reliable.

Fix all the issues by not using the HTTP API in GetState(), and use only
the process state. We still use the API for stopping and killing vfkit
to do graceful shutdown. This also simplifies Remove(), since we need to
handle only the state.Running state.

With this change we consider vfkit as stopped only when the process does
not exist, which takes about 3 seconds after the state is reported as
"stopped".

Example stop flow:

    I0309 18:15:40.260249   18857 main.go:141] libmachine: Stopping "minikube"...
    I0309 18:15:40.263225   18857 main.go:141] libmachine: set state: {State:Stop}
    I0309 18:15:46.266902   18857 main.go:141] libmachine: Machine "minikube" was stopped.
    I0309 18:15:46.267122   18857 stop.go:75] duration metric: took 6.127761459s to stop

Example delete flow:

    I0309 17:00:49.483078   18127 out.go:177] * Deleting "minikube" in vfkit ...
    I0309 17:00:49.499252   18127 main.go:141] libmachine: set state: {State:HardStop}
    I0309 17:00:49.569938   18127 lock.go:35] WriteFile acquiring /Users/nir/.kube/config: ...
    I0309 17:00:49.573977   18127 out.go:177] * Removed all traces of the "minikube" cluster.
If setting vfkit state to "Stop" fails, we used to return an error.
Retrying the operation may never succeed.

Fix by falling back to terminating vfkit using SIGTERM. This should be
handled as graceful shutdown in vfkit if it is implemented correctly.

We can still fail if the pidfile is corrupted. Getting process from the
pid returns an error, but it is documented to "always succeed on unix".

In the case when we are sure the vfkit process does not exist, we remove
the pidfile immediately, avoiding leftover pidfile if the caller does
not call GetState() after Stop().

Sending a signal using a pidfile can work for any driver using a
pidfile, so it is implemented as a reusable function.
We know that setting the state to `HardStop` typically fails:

    I0309 19:19:42.378591   21795 out.go:177] 🔥  Deleting "minikube" in vfkit ...
    W0309 19:19:42.397472   21795 delete.go:106] remove failed, will retry: kill: Post "http://_/vm/state": EOF

This may lead to unnecessary retries and delays. Fix by falling back to
sending a SIGKILL signal.

Example delete flow when setting vfkit state fails:

    I0309 20:07:41.688259   25540 out.go:177] 🔥  Deleting "minikube" in vfkit ...
    I0309 20:07:41.712017   25540 main.go:141] libmachine: Failed to set vfkit state to 'HardStop': Post "http://_/vm/state": EOF
    I0309 20:07:41.712082   25540 main.go:141] libmachine: Sending signal "killed" to pid 25482
@nirs nirs force-pushed the vfkit-state-fixes branch from a23597f to 91bd6f2 Compare March 10, 2025 23:21
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 10, 2025
}
log.Infof("Sending signal %q to pid %v", sig, pid)
if err := process.Signal(sig); err != nil {
if err != os.ErrProcessDone {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be more portable to use process.Kill() instead of process.Signal(SIGKILL).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants