fix: Ensure delete.sh script works properly with implicit namespace #217
Conversation
Motivation

The `delete.sh` cleanup script currently doesn't require a namespace to be provided via the `-n` option and uses the current kubectl context's namespace otherwise. However, the `$namespace` variable wasn't set in that case, so later parts of the script might not work as intended.

Modifications

Ensure the `$namespace` variable is set correctly either way.

Result

The `delete.sh` script works properly when the `-n` option isn't used.

Signed-off-by: Nick Hill <[email protected]>
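The kind of namespace-defaulting logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual script; the `resolve_namespace` helper is hypothetical, and the real `delete.sh` may structure this differently.

```shell
#!/bin/bash
# Sketch of the fix: use the -n argument when given, otherwise fall back
# to the current kubectl context's namespace, and finally to "default".
# The resolve_namespace helper is hypothetical.

resolve_namespace() {
  local ns="$1"
  if [ -z "$ns" ] && command -v kubectl >/dev/null 2>&1; then
    # Read the namespace of the current context, if any
    ns=$(kubectl config view --minify --output 'jsonpath={..namespace}' 2>/dev/null || true)
  fi
  echo "${ns:-default}"
}

namespace=""
while getopts "n:" opt; do
  case "$opt" in
    n) namespace="$OPTARG" ;;
    *) echo "usage: $0 [-n namespace]" >&2; exit 1 ;;
  esac
done

namespace=$(resolve_namespace "$namespace")
echo "deleting in namespace: $namespace"
```

Setting `$namespace` unconditionally up front means every later `kubectl ... -n "$namespace"` call in the script sees a non-empty value, whether or not `-n` was passed.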
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chinhuang007, njhill

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
@njhill -- Sorry I am late to the party. I just tried the new script and I am not getting the result I was expecting. I ran the script without the namespace flag in the default namespace. User error?

I start out with a working "quickstart" deployment:

```
$ kubectl get ServingRuntimes
NAME           DISABLED   MODELTYPE     CONTAINERS   AGE
mlserver-0.x              sklearn       mlserver     11d
ovms-1.x                  openvino_ir   ovms         11d
triton-2.x                keras         triton       11d
```

Set the current namespace to `default`:

```
$ kubectl config set-context --current --namespace=default
Context "ckadner-modelmesh-dev/cbu3tugd0r5j02te049g" modified.
```

No `ServingRuntimes` in the `default` namespace:

```
$ kubectl get ServingRuntimes
No resources found in default namespace.
```

Run the delete script without the namespace option:

```
$ ./scripts/delete.sh
current namespace: default
deleting in namespace: default
~/GolandProjects/modelmesh-serving_ckadner/config/default ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config/rbac/namespace-scope ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
customresourcedefinition.apiextensions.k8s.io "inferenceservices.serving.kserve.io" deleted
customresourcedefinition.apiextensions.k8s.io "predictors.serving.kserve.io" deleted
customresourcedefinition.apiextensions.k8s.io "servingruntimes.serving.kserve.io" deleted
Error: unknown flag: --load-restrictor
No resources found
```

3 ModelMesh pods still running:

```
$ kubectl get pods --all-namespaces | grep model
modelmesh-serving   etcd-7dbb56b4d9-fxmd6                   1/1   Running   0   11d
modelmesh-serving   minio-6c55d85f46-kzfnp                  1/1   Running   0   11d
modelmesh-serving   modelmesh-controller-7f7fb577db-5c8c9   1/1   Running   0   11d
```

Try the delete script again after setting the current namespace to `modelmesh-serving`:

```
$ kubectl config set-context --current --namespace=modelmesh-serving
Context "ckadner-modelmesh-dev/cbu3tugd0r5j02te049g" modified.
```

Run the delete script again:

```
$ ./scripts/delete.sh
current namespace: modelmesh-serving
deleting in namespace: modelmesh-serving
~/GolandProjects/modelmesh-serving_ckadner/config/default ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config/rbac/namespace-scope ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
deleting cluster scope RBAC
serviceaccount "modelmesh" deleted
serviceaccount "modelmesh-controller" deleted
role.rbac.authorization.k8s.io "modelmesh-controller-leader-election-role" deleted
role.rbac.authorization.k8s.io "modelmesh-controller-restricted-scc-role" deleted
clusterrole.rbac.authorization.k8s.io "modelmesh-controller-role" deleted
rolebinding.rbac.authorization.k8s.io "modelmesh-controller-leader-election-rolebinding" deleted
rolebinding.rbac.authorization.k8s.io "modelmesh-controller-restricted-scc-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "modelmesh-controller-rolebinding" deleted
networkpolicy.networking.k8s.io "modelmesh-controller" deleted
networkpolicy.networking.k8s.io "modelmesh-runtimes" deleted
configmap "model-serving-config-defaults" deleted
deployment.apps "modelmesh-controller" deleted
Error: unknown flag: --load-restrictor
No resources found
```

2 `modelmesh-serving` pods still running:

```
$ kubectl get pods --all-namespaces | grep model
modelmesh-serving   etcd-7dbb56b4d9-fxmd6    1/1   Running   0   11d
modelmesh-serving   minio-6c55d85f46-kzfnp   1/1   Running   0   11d
```

Run the delete script again with the namespace flag:

```
$ ./scripts/delete.sh --namespace modelmesh-serving
current namespace: modelmesh-serving
Context "ckadner-modelmesh-dev/cbu3tugd0r5j02te049g" modified.
deleting in namespace: modelmesh-serving
~/GolandProjects/modelmesh-serving_ckadner/config/default ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config/rbac/namespace-scope ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "modelmesh-controller-rolebinding" not found
Error: unknown flag: --load-restrictor
No resources found
```

2 ModelMesh pods still running:

```
$ kubectl get pods --all-namespaces | grep model
modelmesh-serving   etcd-7dbb56b4d9-fxmd6    1/1   Running   0   11d
modelmesh-serving   minio-6c55d85f46-kzfnp   1/1   Running   0   11d
```
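The manual `grep model` checks above could be wrapped in a tiny helper so the cleanup result can be verified non-interactively. This is a sketch only; the function name and the `model` pattern are mine, not part of the delete script.

```shell
# Hypothetical helper mirroring the manual checks above: count lines
# matching "model" in `kubectl get pods` output read from stdin, so a
# script can assert that the cleanup left nothing behind.
count_model_pods() {
  # grep -c prints the match count; "|| true" keeps a zero count from
  # being treated as a failure
  grep -c model || true
}

# Against a live cluster:
#   kubectl get pods --all-namespaces --no-headers | count_model_pods
```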
I noticed I was using an old version of kustomize. I redeployed the ModelMesh quickstart, this time with an updated kustomize, and ran the delete script again:

```
$ ./scripts/delete.sh
current namespace: modelmesh-serving
deleting in namespace: modelmesh-serving
~/GolandProjects/modelmesh-serving_ckadner/config/default ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config/rbac/namespace-scope ~/GolandProjects/modelmesh-serving_ckadner/config
~/GolandProjects/modelmesh-serving_ckadner/config
deleting cluster scope RBAC
serviceaccount "modelmesh" deleted
serviceaccount "modelmesh-controller" deleted
role.rbac.authorization.k8s.io "modelmesh-controller-leader-election-role" deleted
role.rbac.authorization.k8s.io "modelmesh-controller-restricted-scc-role" deleted
clusterrole.rbac.authorization.k8s.io "modelmesh-controller-role" deleted
rolebinding.rbac.authorization.k8s.io "modelmesh-controller-leader-election-rolebinding" deleted
rolebinding.rbac.authorization.k8s.io "modelmesh-controller-restricted-scc-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "modelmesh-controller-rolebinding" deleted
networkpolicy.networking.k8s.io "modelmesh-controller" deleted
networkpolicy.networking.k8s.io "modelmesh-runtimes" deleted
customresourcedefinition.apiextensions.k8s.io "inferenceservices.serving.kserve.io" deleted
customresourcedefinition.apiextensions.k8s.io "predictors.serving.kserve.io" deleted
customresourcedefinition.apiextensions.k8s.io "servingruntimes.serving.kserve.io" deleted
configmap "model-serving-config-defaults" deleted
deployment.apps "modelmesh-controller" deleted
service "etcd" deleted
deployment.apps "etcd" deleted
secret "model-serving-etcd" deleted
service "minio" deleted
deployment.apps "minio" deleted
secret "storage-config" deleted
```

All modelmesh pods are gone now:

```
$ kubectl get pods --all-namespaces | grep model
```

🦗 🦗 🦗

The `modelmesh-serving` service is still there, though:

```
$ kubectl get svc --all-namespaces | grep model
modelmesh-serving   modelmesh-serving   ClusterIP   None   <none>   8033/TCP,8008/TCP,2112/TCP   11d
```
Thanks @ckadner. Re the kustomize thing, it looks like in other places we made it compatible with different versions; I guess it would be good to do that here too. That's interesting that the service was left behind.
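One way to make the script tolerant of different kustomize versions is to branch on the installed major version when choosing the load-restrictor flag. This is a sketch only; the flag spellings below are assumptions meant to illustrate the idea and should be checked against the kustomize releases the project supports.

```shell
# Pick a load-restrictor flag spelling based on the kustomize major
# version, since the flag changed across major releases. The spellings
# here are assumptions; verify against your kustomize release notes.
kustomize_load_restrictor_flag() {
  local version="$1"        # e.g. "v3.8.7" or "v4.5.4"
  local major="${version#v}"
  major="${major%%.*}"      # keep only the major version number
  if [ "$major" -ge 4 ]; then
    echo "--load-restrictor=LoadRestrictionsNone"
  else
    echo "--load_restrictor=none"
  fi
}

# Example:
#   flag=$(kustomize_load_restrictor_flag v4.5.4)
#   kustomize build $flag ...
```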
The service currently belongs to the namespace when ModelMesh is in the default cluster scope mode, because the controller deployment can't own resources in other namespaces. While the service in the controller namespace could still be owned by the controller deployment, the existing `ServiceReconciler.NamespaceOwned` logic applies uniformly, and that will be the case for user namespaces in any event, meaning deleting ModelMesh resources won't delete the service. This particular behavior is independent of the delete script. Additional logic could be added, I guess, to have the service in the controller namespace owned by the controller deployment while services in user namespaces are owned by the namespace.
The service should be cleaned up when ModelMesh is installed in cluster scope mode.
Ah, good point @chinhuang007, my head was still in namespace scope. I guess the main thing would be to make sure the delete script explicitly cleans up everything it should when in cluster scope mode, e.g. using a selector or something. We should probably open a minor bug issue for this.
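The selector idea could look something like the sketch below. The label key and value are assumptions for illustration, not the labels ModelMesh actually applies, so they would need to match whatever labels the install manifests set.

```shell
# Build a label-selector-based cleanup command instead of enumerating
# resource names, so leftovers (like the service above) are caught too.
# The label key/value are hypothetical.
build_cleanup_cmd() {
  local ns="$1"
  local selector="app.kubernetes.io/managed-by=modelmesh-controller"
  echo "kubectl delete deployments,services,secrets,configmaps -n ${ns} -l ${selector} --ignore-not-found"
}

# Review the command, then run it against a live cluster:
#   eval "$(build_cleanup_cmd modelmesh-serving)"
```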