-
Notifications
You must be signed in to change notification settings - Fork 426
Add vfio mode to generate CDI specs for NVIDIA passthrough GPUs #315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@cdesiniotis if this is still valid, please rebase or close. |
|
@varunrsekar would this be interesting for what you're looking at in NVIDIA/k8s-dra-driver-gpu#183? |
pkg/nvcdi/lib-vfio.go
Outdated
| // GetCommonEdits returns common edits for ALL devices. | ||
| // Note, currently there are no common edits. | ||
| func (l *vfiolib) GetCommonEdits() (*cdi.ContainerEdits, error) { | ||
| return &cdi.ContainerEdits{ContainerEdits: &specs.ContainerEdits{}}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing a device node for /dev/vfio/vfio:
&cdi.ContainerEdits{
ContainerEdits: &specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
&specs.DeviceNode{
Path: "/dev/vfio/vfio",
},
},
},
}
yeah this will help! |
|
@varunrsekar I am no longer working on this -- feel free to take ownership of this PR if you need it. |
Pull Request Test Coverage Report for Build 16119484166Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 non blocking nits
…through GPUs Signed-off-by: Christopher Desiniotis <[email protected]>
…77c7a9ac1c84ae8b0aa Signed-off-by: Evan Lezar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new VFIO mode to the nvcdi API and the nvidia-ctk generate command to support NVIDIA GPU passthrough via VFIO.
- Introduces a
ModeVfioconstant and integrates it into the mode resolver. - Adds
WithPCILiboption and dependency onnvpcifor querying PCI devices. - Implements
vfiolibto generate VFIO-based CDI specs, including unit tests.
Reviewed Changes
Copilot reviewed 6 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/nvcdi/options.go | Import nvpci and add WithPCILib option |
| pkg/nvcdi/mode.go | Register ModeVfio in the list of supported modes |
| pkg/nvcdi/lib.go | Wire up nvpcilib in New for ModeVfio |
| pkg/nvcdi/lib-vfio.go | Implement vfiolib and VFIO device spec generation |
| pkg/nvcdi/lib-vfio_test.go | Add unit tests for the new VFIO mode |
| go.mod | Bump github.com/NVIDIA/go-nvlib for PCI library support |
Comments suppressed due to low confidence (3)
pkg/nvcdi/lib-vfio.go:93
- [nitpick] Consider using
%qinstead of%vto quote the invalid ID in the error message for clarity ("invalid channel ID %q: %w").
return nil, fmt.Errorf("invalid channel ID %v: %w", id, err)
pkg/nvcdi/lib-vfio_test.go:28
- Tests cover positive VFIO scenarios but don’t verify that non-
vfio-pcidevices are filtered out or that error paths (e.g., invalid ID parsing, PCI library failures) behave as expected. Consider adding cases for those.
func TestModeVfio(t *testing.T) {
pkg/nvcdi/lib-vfio.go:33
- [nitpick] The field
groupinvfioDeviceis ambiguous — consider renaming it toiommuGroupto make its purpose clearer.
group int
| } | ||
| } | ||
|
|
||
| // WithPCILib sets the pci library to be used for CDI spec generation. |
Copilot
AI
Jul 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should use the uppercase acronym PCI for consistency with the function name (WithPCILib).
| // WithPCILib sets the pci library to be used for CDI spec generation. | |
| // WithPCILib sets the PCI library to be used for CDI spec generation. |
| DeviceNodes: []*specs.DeviceNode{ | ||
| { | ||
| Path: path, | ||
| HostPath: filepath.Join(l.devRoot, path), |
Copilot
AI
Jul 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When path starts with a slash, filepath.Join(l.devRoot, path) will ignore l.devRoot. You may need to strip the leading slash or construct the host path as filepath.Join(l.devRoot, path[1:]).
| HostPath: filepath.Join(l.devRoot, path), | |
| HostPath: filepath.Join(l.devRoot, path[1:]), |
| DeviceNodes: []*specs.DeviceNode{ | ||
| { | ||
| Path: "/dev/vfio/vfio", | ||
| HostPath: filepath.Join(l.devRoot, "/dev/vfio/vfio"), |
Copilot
AI
Jul 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue here: joining an absolute path to l.devRoot will ignore the root override. Consider constructing this path without the leading slash or using path.Join on relative segments.
| HostPath: filepath.Join(l.devRoot, "/dev/vfio/vfio"), | |
| HostPath: filepath.Join(l.devRoot, "dev/vfio/vfio"), |
|
@elezar Sorry this completely missed my radar. Are you driving this to completion? |
This change adds a vfio mode to the nvcdi API (and nvidia-ctk generate command).