Skip to content

Commit

Permalink
vfkit: Simpler and more robust GetState()
Browse files Browse the repository at this point in the history
GetState() had several issues:

- When accessing vfkit HTTP API, we handled only "running",
  "VirtualMachineStateRunning", "stopped", and
  "VirtualMachineStateStopped", but there are other 10 possible states,
  which we handled as state.None, when vfkit is running and need to be
  stopped. This can lead to wrong handling in the caller.

- When handling "stopped" and "VirtualMachineStateStopped" we returned
  state.Stopped, but did not remove the pidfile. This can cause
  termination of unrelated process or reporting wrong status when the
  pid is reused.

- Accessing the HTTP API will fail after we stop or kill it. This
  cause GetState() to fail when the process is actually stopped, which
  can lead to unnecessary retries and long delays (kubernetes#20503).

- When retuning state.None during Remove(), we use tried to do graceful
  shutdown which does not make sense in minikube delete flow, and is not
  consistent with state.Running handling.

Accessing vfkit API to check for state does not add much value for our
used case, checking if the vfkit process is running, and it is not
reliable.

Fix all the issues by not using the HTTP API in GetState(), and use only
the process state. We still use the API for stopping and killing vfkit
to do graceful shutdown. This also simplifies Remove(), since we need to
handle only the state.Running state.

With this change we consider vfkit as stopped only when the process does
not exist, which takes about 3 seconds after the state is reported as
"stopped".

Example stop flow:

    I0309 18:15:40.260249   18857 main.go:141] libmachine: Stopping "minikube"...
    I0309 18:15:40.263225   18857 main.go:141] libmachine: set state: {State:Stop}
    I0309 18:15:46.266902   18857 main.go:141] libmachine: Machine "minikube" was stopped.
    I0309 18:15:46.267122   18857 stop.go:75] duration metric: took 6.127761459s to stop

Example delete flow:

    I0309 17:00:49.483078   18127 out.go:177] * Deleting "minikube" in vfkit ...
    I0309 17:00:49.499252   18127 main.go:141] libmachine: set state: {State:HardStop}
    I0309 17:00:49.569938   18127 lock.go:35] WriteFile acquiring /Users/nir/.kube/config: ...
    I0309 17:00:49.573977   18127 out.go:177] * Removed all traces of the "minikube" cluster.
  • Loading branch information
nirs committed Mar 10, 2025
1 parent 3b00042 commit 902da15
Showing 1 changed file with 1 addition and 16 deletions.
17 changes: 1 addition & 16 deletions pkg/drivers/vfkit/vfkit.go
Original file line number Diff line number Diff line change
Expand Up @@ -175,17 +175,7 @@ func (d *Driver) GetState() (state.State, error) {
os.Remove(pidfile)
return state.Stopped, nil
}
ret, err := d.GetVFKitState()
if err != nil {
return state.Error, err
}
switch ret {
case "running", "VirtualMachineStateRunning":
return state.Running, nil
case "stopped", "VirtualMachineStateStopped":
return state.Stopped, nil
}
return state.None, nil
return state.Running, nil
}

func (d *Driver) Create() error {
Expand Down Expand Up @@ -341,11 +331,6 @@ func (d *Driver) Remove() error {
return errors.Wrap(err, "kill")
}
}
if s != state.Stopped {
if err := d.SetVFKitState("Stop"); err != nil {
return errors.Wrap(err, "quit")
}
}
return nil
}

Expand Down

0 comments on commit 902da15

Please sign in to comment.