-
Notifications
You must be signed in to change notification settings - Fork 524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DRAFT] Nvidia Settings API changes #4125
Conversation
Release.toml
Outdated
@@ -1,4 +1,4 @@ | |||
version = "1.21.0" | |||
version = "1.21.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically we would not add new settings to the API in a point release. These changes should target 1.22. But don't do the release version bump in a feature PR, it's not really related to your feature and creates some churn.
sources/settings-migrations/v1.21.1/nvidia-k8s-device-plugin-settings/src/main.rs
Outdated
Show resolved
Hide resolved
sources/settings-migrations/v1.21.1/nvidia-k8s-device-plugin-metadata/src/main.rs
Outdated
Show resolved
Hide resolved
sources/settings-migrations/v1.21.1/nvidia-container-runtime-settings/src/main.rs
Show resolved
Hide resolved
sources/settings-migrations/v1.21.1/nvidia-k8s-device-plugin-settings/src/main.rs
Show resolved
Hide resolved
4b97b65
to
20f5ffc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good apart from the parts that need to be reverted or cleaned up.
It'd be good to test a non-nvidia aws-k8s
variant to confirm that the device plugin settings aren't recognized, which would indicate that the feature flag wasn't used at build time.
-p settings-plugin-aws-k8s-nvidia \ | ||
%{nil} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove one of the two newlines:
|
||
%description aws-k8s-nvidia | ||
%{summary}. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid adding unnecessary whitespace:
I spoke with @monirul yesterday about an idea to programmatically verify that feature unification has not taken place. Since we need to do a settings-sdk release for the new models anyways, I think it would be a good idea to make the requisite changes their too. The basic idea is:
|
20f5ffc
to
6c7ab5f
Compare
6c7ab5f
to
8adc19c
Compare
This was superseded by #4182. |
Issue number:
Closes #
Description of changes:
This PR introduces new settings API for Nvidia GPUs for Kubernetes Nvidia variants.
New settings are
settings.nvidia-container-runtime.visible-devices-as-volume-mounts
accept-nvidia-visible-devices-as-volume-mounts
value for k8s container-toolkittrue
|false
default:true
settings.nvidia-container-runtime.visible-devices-envvar-when-unprivileged
accept-nvidia-visible-devices-envvar-when-unprivileged
settings of nvidia container runtime for k8s varienttrue
|false
default:false
settings.kubernetes.device-plugins.nvidia.pass-device-specs
pass-device-specs
settings of the device plugin that pass the list of DeviceSpecs to the kubelet on Allocatetrue
|false
default:true
settings.kubernetes.device-plugins.nvidia.device-id-strategy
device-id-strategy
settings of the device plugin which specifies how GPUs are identified and selected for workloads running in a Kubernetes clusteruuid
|index
Default:index
settings.kubernetes.device-plugins.nvidia.device-list-strategy
device-list-strategy
setting in NVIDIA Kubernetes device plugins. It is used to configure how GPUs are listed and allocated to pods in a Kubernetes clusterenvvar
|volume-mounts
default:volume-mounts
Testing done:
Yes.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.