Contributor

@ludfjig ludfjig commented Sep 30, 2025

This is PR 2/3 in a bigger effort to remove duplicate code across drivers.

This PR depends on #907, which must be merged first; I will mark this PR as ready for review once that happens.

This PR introduces

  • Vm trait. A minimal trait for the functionality common to every VM backend. It abstracts over the differences between kvm, mshv, and whp. This trait only knows about things like getting and setting registers and running the vCPU; it knows nothing about guest functions or Hyperlight specifics.
  • HyperlightVm struct. A struct that contains the dyn Vm above, as well as things like guest_ptr, rsp, memory regions, and gdb connections. You can think of it as replacing the previous Hypervisor trait, but it is now a single struct to avoid duplicate code. HyperlightVm knows about initialization, dispatching guest calls, gdb debugging, guest tracing, and so on, which the Vm trait doesn't.
  • Simplifies and refactors some cancellation stuff relating to kill() without changing behavior
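
A minimal sketch of the split described in the list above, for orientation only. Everything here is an assumption made for illustration: the method names (`regs`, `set_regs`, `run_vcpu`), the `Regs` and `VmExit` types, the `FakeVm` backend, and the `entrypoint` field are hypothetical and are not the signatures used in this PR.

```rust
use std::fmt::Debug;

/// Hypothetical, heavily reduced register file.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Regs {
    pub rip: u64,
    pub rsp: u64,
}

/// Hypothetical exit reasons a backend can report.
#[derive(Debug, PartialEq)]
pub enum VmExit {
    Halt,
    IoOut(u16),
}

/// Backend-agnostic trait: registers and run only, no Hyperlight specifics.
pub trait Vm: Debug {
    fn regs(&self) -> Regs;
    fn set_regs(&mut self, regs: Regs);
    fn run_vcpu(&mut self) -> VmExit;
}

/// Stand-in backend used only for this sketch (not kvm, mshv, or whp).
#[derive(Debug, Default)]
pub struct FakeVm {
    regs: Regs,
}

impl Vm for FakeVm {
    fn regs(&self) -> Regs {
        self.regs
    }
    fn set_regs(&mut self, regs: Regs) {
        self.regs = regs;
    }
    fn run_vcpu(&mut self) -> VmExit {
        // A real backend would enter the guest here; the fake halts immediately.
        VmExit::Halt
    }
}

/// One struct that owns a `dyn Vm` plus the Hyperlight-specific state.
pub struct HyperlightVm {
    vm: Box<dyn Vm>,
    entrypoint: u64,
    rsp: u64,
}

impl HyperlightVm {
    pub fn new(vm: Box<dyn Vm>, entrypoint: u64, rsp: u64) -> Self {
        Self { vm, entrypoint, rsp }
    }

    /// Dispatch logic is written once here instead of once per backend.
    pub fn dispatch_call(&mut self) -> VmExit {
        self.vm.set_regs(Regs { rip: self.entrypoint, rsp: self.rsp });
        self.vm.run_vcpu()
    }

    pub fn regs(&self) -> Regs {
        self.vm.regs()
    }
}
```

The point of the shape is that adding a backend means implementing only the small trait, while initialization and dispatch stay in the single struct.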

When reviewing, compare the new file hyperlight_vm.rs against the old kvm.rs, hyperv_linux.rs, and hyperv_windows.rs.

Closes #465, #904

@ludfjig ludfjig force-pushed the vm_trait_new branch 3 times, most recently from 81f0d54 to 62fad87 Compare October 22, 2025 19:44
@ludfjig ludfjig added the kind/refactor For PRs that restructure or remove code without adding new functionality. label Oct 22, 2025
@ludfjig ludfjig force-pushed the vm_trait_new branch 17 times, most recently from 1562f26 to edc7f00 Compare October 24, 2025 20:15
@ludfjig ludfjig force-pushed the vm_trait_new branch 4 times, most recently from f6337f9 to 350b8e0 Compare October 28, 2025 18:05
@ludfjig ludfjig marked this pull request as ready for review October 28, 2025 18:39
@ludfjig ludfjig force-pushed the vm_trait_new branch 5 times, most recently from d2e6bf8 to afa720d Compare November 4, 2025 00:55
Contributor

@jsturtevant jsturtevant left a comment

I spent some time looking at these changes. There seem to be a lot of really good things in here. My main concern is that it is a very big change that encompasses new features, changes, and simplifications across multiple implementations.

My initial thought would be to break this down into small chunks: add a Vm trait and implement it for a single platform, then move each platform over. This would keep each set of changes smaller. I also think we could keep things simpler to review by leaving out updates to implementations, like the changes for "Simplifies and refactors some cancellation stuff", and having those as separate change sets.

Contributor

@dblnz dblnz left a comment

Good work!
I like that we get rid of duplicated code, and we now have a clear separation of common Vm code and specific Vm logic.
It is definitely the right direction we want to go.

I left some small comments.
I do need to emphasize that, because this is such a big change, it is difficult to follow where code was moved.

)
)]
Retry(),
#[cfg(gdb)]
Contributor

Nit: Invert the comment and #[cfg..

Suggested change
#[cfg(gdb)]
/// The vCPU has exited due to a debug event (usually breakpoint)
#[cfg(gdb)]

use crate::sandbox::uninitialized::SandboxRuntimeConfig;
use crate::{HyperlightError, Result, log_then_return, new_error};

pub(crate) struct HyperlightVm {
Contributor

Some comments here explaining what this struct is meant to achieve and why we need it would be helpful.
I assume this replaces the old Hypervisor trait.

// Architectures Software Developer's Manual
if dr6 & DR6_BS_FLAG_MASK != 0 && single_step {
return VcpuStopReason::DoneStep;
if dr6 & DR6_BS_FLAG_MASK != 0 {
Contributor

I think it is fine to remove the single_step variable here. It was only an additional piece of state used to verify that the flag was in sync with the internal debugging state.
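
A small sketch of the two checks being compared in this thread, assuming the mask is bit 14 of DR6 (the BS, single-step, flag per the Intel SDM). The function names and the `Other` variant are hypothetical, added only so the example is self-contained; `DR6_BS_FLAG_MASK` and `VcpuStopReason::DoneStep` are the names from the diff.

```rust
/// DR6.BS (bit 14) is set when a debug exception was caused by single-stepping.
const DR6_BS_FLAG_MASK: u64 = 1 << 14;

#[derive(Debug, PartialEq)]
enum VcpuStopReason {
    DoneStep,
    /// Hypothetical catch-all for this sketch.
    Other,
}

/// Old shape: the hardware flag must agree with the debugger's own state.
fn stop_reason_with_state(dr6: u64, single_step: bool) -> VcpuStopReason {
    if dr6 & DR6_BS_FLAG_MASK != 0 && single_step {
        return VcpuStopReason::DoneStep;
    }
    VcpuStopReason::Other
}

/// New shape: trust the hardware flag alone.
fn stop_reason_flag_only(dr6: u64) -> VcpuStopReason {
    if dr6 & DR6_BS_FLAG_MASK != 0 {
        return VcpuStopReason::DoneStep;
    }
    VcpuStopReason::Other
}
```

The two versions differ only when BS is set while the debugger believes it is not single-stepping, which is the out-of-sync case the extra variable used to guard against.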


if BP_EX_ID == exception && sw_breakpoints.contains_key(&rip) {
return VcpuStopReason::SwBp;
if BP_EX_ID == exception {
Contributor

I think this might remove an option here.

If the vCPU stops because of some other issue, rather than a SW/HW breakpoint set by the debugger, it will no longer be reported as an unknown breakpoint.

I am not sure, though, whether this means the debugger won't show it as an exception.

Contributor Author

@ludfjig ludfjig Nov 5, 2025

I think #BP is only ever raised by INT3, so checking sw breakpoints is redundant?
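
To make the behavioral difference under discussion concrete, here is a hedged sketch of both versions of the check. `BP_EX_ID` is assumed to be exception vector 3 (#BP), the map's value type (the saved original byte) is a guess, and the function names and the `Unknown` variant are hypothetical.

```rust
use std::collections::HashMap;

/// Exception vector 3 is #BP on x86.
const BP_EX_ID: u32 = 3;

#[derive(Debug, PartialEq)]
enum VcpuStopReason {
    SwBp,
    /// Hypothetical fallback for this sketch.
    Unknown,
}

/// Old shape: only report SwBp when the debugger planted a breakpoint at `rip`.
fn classify_with_lookup(
    exception: u32,
    rip: u64,
    sw_breakpoints: &HashMap<u64, u8>,
) -> VcpuStopReason {
    if BP_EX_ID == exception && sw_breakpoints.contains_key(&rip) {
        return VcpuStopReason::SwBp;
    }
    VcpuStopReason::Unknown
}

/// New shape: any #BP is reported as a software breakpoint.
fn classify_by_vector(exception: u32) -> VcpuStopReason {
    if BP_EX_ID == exception {
        return VcpuStopReason::SwBp;
    }
    VcpuStopReason::Unknown
}
```

The only input on which the two disagree is a #BP at an address the debugger did not register, for example an INT3 that is already part of the guest binary.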

// Mark that a guest function call is now active
// (This also increments the generation counter internally)
// The guard will automatically clear call_active when dropped
let _guard = CallActiveGuard::new(self.vm.interrupt_handle())?;
Contributor

Where did this go to? Is call_active still being used? What are the other changes related to this?

Contributor Author

@ludfjig ludfjig Nov 7, 2025

Nope, it's all gone. I was not a big fan of the original logic, but we still merged it in favor of velocity. All logic related to cancellation is now in HyperlightVm::run, and it has been greatly simplified.
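
For readers who did not see the removed code, the quoted comments describe an RAII guard. The sketch below shows that general pattern only, under stated assumptions: the `InterruptState` type, its fields, and the infallible constructor are hypothetical (the removed call returned a `Result`), and this is not the code that was deleted.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared state that a kill() caller would inspect.
#[derive(Default)]
struct InterruptState {
    call_active: AtomicBool,
    generation: AtomicU64,
}

/// Marks a guest call as active for as long as the guard is alive.
struct CallActiveGuard {
    state: Arc<InterruptState>,
}

impl CallActiveGuard {
    fn new(state: Arc<InterruptState>) -> Self {
        // Bump the generation so a stale kill() can be told apart from a fresh one.
        state.generation.fetch_add(1, Ordering::SeqCst);
        state.call_active.store(true, Ordering::SeqCst);
        Self { state }
    }
}

impl Drop for CallActiveGuard {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns and panics.
        self.state.call_active.store(false, Ordering::SeqCst);
    }
}
```

Moving the bookkeeping into a single run function trades this scope-based cleanup for having all cancellation state changes visible in one place.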

Development

Successfully merging this pull request may close these issues.

Rethink driver API

3 participants