-
Notifications
You must be signed in to change notification settings - Fork 374
Add vfio mode to generate CDI specs for NVIDIA passthrough GPUs #315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@cdesiniotis if this is still valid, please rebase or close. |
@varunrsekar would this be interesting for what you're looking at in NVIDIA/k8s-dra-driver-gpu#183? |
pkg/nvcdi/lib-vfio.go
Outdated
// GetCommonEdits returns common edits for ALL devices. | ||
// Note, currently there are no common edits. | ||
func (l *vfiolib) GetCommonEdits() (*cdi.ContainerEdits, error) { | ||
return &cdi.ContainerEdits{ContainerEdits: &specs.ContainerEdits{}}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing a device node for /dev/vfio/vfio
:
&cdi.ContainerEdits{
ContainerEdits: &specs.ContainerEdits{
DeviceNodes: []*specs.DeviceNode{
&specs.DeviceNode{
Path: "/dev/vfio/vfio",
},
},
},
}
yeah this will help! |
@varunrsekar I am no longer working on this -- feel free to take ownership of this PR if you need it. |
Pull Request Test Coverage Report for Build 16119484166Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 non blocking nits
…through GPUs Signed-off-by: Christopher Desiniotis <[email protected]>
…77c7a9ac1c84ae8b0aa Signed-off-by: Evan Lezar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new VFIO mode to the nvcdi API and the nvidia-ctk generate
command to support NVIDIA GPU passthrough via VFIO.
- Introduces a
ModeVfio
constant and integrates it into the mode resolver. - Adds
WithPCILib
option and dependency onnvpci
for querying PCI devices. - Implements
vfiolib
to generate VFIO-based CDI specs, including unit tests.
Reviewed Changes
Copilot reviewed 6 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
pkg/nvcdi/options.go | Import nvpci and add WithPCILib option |
pkg/nvcdi/mode.go | Register ModeVfio in the list of supported modes |
pkg/nvcdi/lib.go | Wire up nvpcilib in New for ModeVfio |
pkg/nvcdi/lib-vfio.go | Implement vfiolib and VFIO device spec generation |
pkg/nvcdi/lib-vfio_test.go | Add unit tests for the new VFIO mode |
go.mod | Bump github.com/NVIDIA/go-nvlib for PCI library support |
Comments suppressed due to low confidence (3)
pkg/nvcdi/lib-vfio.go:93
- [nitpick] Consider using
%q
instead of%v
to quote the invalid ID in the error message for clarity ("invalid channel ID %q: %w"
).
return nil, fmt.Errorf("invalid channel ID %v: %w", id, err)
pkg/nvcdi/lib-vfio_test.go:28
- Tests cover positive VFIO scenarios but don’t verify that non-
vfio-pci
devices are filtered out or that error paths (e.g., invalid ID parsing, PCI library failures) behave as expected. Consider adding cases for those.
func TestModeVfio(t *testing.T) {
pkg/nvcdi/lib-vfio.go:33
- [nitpick] The field
group
invfioDevice
is ambiguous — consider renaming it toiommuGroup
to make its purpose clearer.
group int
@@ -43,6 +44,13 @@ func WithInfoLib(infolib info.Interface) Option { | |||
} | |||
} | |||
|
|||
// WithPCILib sets the pci library to be used for CDI spec generation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should use the uppercase acronym PCI for consistency with the function name (WithPCILib
).
// WithPCILib sets the pci library to be used for CDI spec generation. | |
// WithPCILib sets the PCI library to be used for CDI spec generation. |
Copilot uses AI. Check for mistakes.
DeviceNodes: []*specs.DeviceNode{ | ||
{ | ||
Path: path, | ||
HostPath: filepath.Join(l.devRoot, path), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When path
starts with a slash, filepath.Join(l.devRoot, path)
will ignore l.devRoot
. You may need to strip the leading slash or construct the host path as filepath.Join(l.devRoot, path[1:])
.
HostPath: filepath.Join(l.devRoot, path), | |
HostPath: filepath.Join(l.devRoot, path[1:]), |
Copilot uses AI. Check for mistakes.
DeviceNodes: []*specs.DeviceNode{ | ||
{ | ||
Path: "/dev/vfio/vfio", | ||
HostPath: filepath.Join(l.devRoot, "/dev/vfio/vfio"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue here: joining an absolute path to l.devRoot
will ignore the root override. Consider constructing this path without the leading slash or using path.Join
on relative segments.
HostPath: filepath.Join(l.devRoot, "/dev/vfio/vfio"), | |
HostPath: filepath.Join(l.devRoot, "dev/vfio/vfio"), |
Copilot uses AI. Check for mistakes.
This change adds a vfio mode to the nvcdi API (and nvidia-ctk generate command).