Failed to delete volumesnapshot if the volumesnapshotclass is deleted #412

Closed
Madhu-1 opened this issue Oct 29, 2020 · 4 comments · Fixed by #423

Comments

@Madhu-1
Contributor

Madhu-1 commented Oct 29, 2020

When a PVC is deleted, nothing blocks the deletion even if the StorageClass has already been deleted. The external-provisioner sends a request to the CSI driver with the volume ID, and no secrets are sent because it cannot look up the StorageClass name. For a volume snapshot, however, the VolumeSnapshot deletion never completes if the VolumeSnapshotClass has already been deleted. Per the CSI spec, secrets are optional parameters, so why is the external-snapshotter not sending the delete-snapshot request to the CSI driver? Is this expected behavior?

```
I1029 10:13:02.926462       1 snapshot_controller.go:308] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent: set DeletionTimeStamp on content [snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4].
I1029 10:13:02.934162       1 snapshot_controller.go:316] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent: Remove Finalizer for VolumeSnapshot[default/rbd-pvc-snapshot]
I1029 10:13:02.934192       1 snapshot_controller.go:901] checkandRemovePVCFinalizer for snapshot [rbd-pvc-snapshot]: snapshot status [&v1beta1.VolumeSnapshotStatus{BoundVolumeSnapshotContentName:(*string)(0xc000285150), CreationTime:(*v1.Time)(0xc0004c2d80), ReadyToUse:(*bool)(0xc0002a8308), RestoreSize:(*resource.Quantity)(0xc00029f300), Error:(*v1beta1.VolumeSnapshotError)(0xc0002851b0)}]
I1029 10:13:02.943296       1 reflector.go:369] github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117: forcing resync
I1029 10:13:02.943400       1 snapshot_controller_base.go:158] enqueued "default/rbd-pvc-snapshot" for sync
I1029 10:13:02.951105       1 util.go:264] storeObjectUpdate updating snapshot "default/rbd-pvc-snapshot" with version 1567
I1029 10:13:02.951304       1 snapshot_controller.go:1344] Removed protection finalizer from volume snapshot default/rbd-pvc-snapshot
I1029 10:13:02.951460       1 snapshot_controller_base.go:202] syncSnapshotByKey[default/rbd-pvc-snapshot]
I1029 10:13:02.951525       1 snapshot_controller_base.go:205] snapshotWorker: snapshot namespace [default] name [rbd-pvc-snapshot]
I1029 10:13:02.951545       1 snapshot_controller_base.go:328] checkAndUpdateSnapshotClass [rbd-pvc-snapshot]: VolumeSnapshotClassName [csi-rbdplugin-snapclass]
I1029 10:13:02.951555       1 snapshot_controller.go:1176] getSnapshotClass: VolumeSnapshotClassName [csi-rbdplugin-snapclass]
E1029 10:13:02.951573       1 snapshot_controller.go:1180] failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:13:02.951597       1 snapshot_controller_base.go:331] checkAndUpdateSnapshotClass failed to getSnapshotClass volumesnapshotclass.snapshot.storage.k8s.io "csi-rbdplugin-snapclass" not found
I1029 10:13:02.951609       1 snapshot_controller.go:721] updateSnapshotStatusWithEvent[default/rbd-pvc-snapshot]
I1029 10:13:02.951618       1 snapshot_controller.go:724] updateSnapshotStatusWithEvent[rbd-pvc-snapshot]: the same error &{2020-10-29 10:11:39 +0000 UTC 0xc0002851d0} is already set
I1029 10:13:02.951694       1 snapshot_controller_base.go:220] Snapshot "default/rbd-pvc-snapshot" is being deleted. SnapshotClass has already been removed
I1029 10:13:02.951715       1 snapshot_controller_base.go:222] Updating snapshot "default/rbd-pvc-snapshot"
I1029 10:13:02.951728       1 snapshot_controller_base.go:358] updateSnapshot "default/rbd-pvc-snapshot"
I1029 10:13:02.951751       1 util.go:264] storeObjectUpdate updating snapshot "default/rbd-pvc-snapshot" with version 1567
I1029 10:13:02.951778       1 snapshot_controller.go:180] synchronizing VolumeSnapshot[default/rbd-pvc-snapshot]: bound to: "snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", Completed: false
I1029 10:13:02.951792       1 snapshot_controller.go:182] syncSnapshot [default/rbd-pvc-snapshot]: check if we should remove finalizer on snapshot PVC source and remove it if we can
I1029 10:13:02.951813       1 snapshot_controller.go:901] checkandRemovePVCFinalizer for snapshot [rbd-pvc-snapshot]: snapshot status [&v1beta1.VolumeSnapshotStatus{BoundVolumeSnapshotContentName:(*string)(0xc000285150), CreationTime:(*v1.Time)(0xc0004c2d80), ReadyToUse:(*bool)(0xc0002a8308), RestoreSize:(*resource.Quantity)(0xc00029f300), Error:(*v1beta1.VolumeSnapshotError)(0xc0002851b0)}]
I1029 10:13:02.951893       1 snapshot_controller.go:191] syncSnapshot[default/rbd-pvc-snapshot]: check if we should add invalid label on snapshot
I1029 10:13:02.951919       1 snapshot_controller.go:238] processSnapshotWithDeletionTimestamp VolumeSnapshot[default/rbd-pvc-snapshot]: bound to: "snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", Completed: false
I1029 10:13:02.951936       1 snapshot_controller.go:272] processSnapshotWithDeletionTimestamp[default/rbd-pvc-snapshot]: delete snapshot content and remove finalizer from snapshot if needed
I1029 10:13:02.951951       1 snapshot_controller.go:278] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent VolumeSnapshot[default/rbd-pvc-snapshot]: bound to: "snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", Completed: false
I1029 10:13:02.951991       1 snapshot_controller.go:796] isVolumeBeingCreatedFromSnapshot: no volume is being created from snapshot default/rbd-pvc-snapshot
I1029 10:13:02.952021       1 snapshot_controller.go:297] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent[default/rbd-pvc-snapshot]: Set VolumeSnapshotBeingDeleted annotation on the content [snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4]
I1029 10:13:02.952036       1 snapshot_controller.go:308] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent: set DeletionTimeStamp on content [snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4].
I1029 10:13:02.964786       1 snapshot_controller.go:316] checkandRemoveSnapshotFinalizersAndCheckandDeleteContent: Remove Finalizer for VolumeSnapshot[default/rbd-pvc-snapshot]
I1029 10:13:02.964811       1 snapshot_controller.go:901] checkandRemovePVCFinalizer for snapshot [rbd-pvc-snapshot]: snapshot status [&v1beta1.VolumeSnapshotStatus{BoundVolumeSnapshotContentName:(*string)(0xc000285150), CreationTime:(*v1.Time)(0xc0004c2d80), ReadyToUse:(*bool)(0xc0002a8308), RestoreSize:(*resource.Quantity)(0xc00029f300), Error:(*v1beta1.VolumeSnapshotError)(0xc0002851b0)}]
I1029 10:13:02.974612       1 util.go:264] storeObjectUpdate updating snapshot "default/rbd-pvc-snapshot" with version 1567
I1029 10:13:02.974651       1 snapshot_controller.go:1344] Removed protection finalizer from volume snapshot default/rbd-pvc-snapshot
```

```
E1029 10:16:55.679819       1 snapshot_controller.go:224] getCSISnapshotInput failed to getClassFromVolumeSnapshot failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:16:55.685015       1 goroutinemap.go:150] Operation for "delete-snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4" failed. No retries permitted until 2020-10-29 10:17:27.679951762 +0000 UTC m=+442.637480008 (durationBeforeRetry 32s). Error: "failed to get input parameters to delete snapshot for content snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4: \"failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: \\\"volumesnapshotclass.snapshot.storage.k8s.io \\\\\\\"csi-rbdplugin-snapclass\\\\\\\" not found\\\"\""
I1029 10:16:55.684904       1 event.go:281] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", UID:"ffc95922-3b31-494e-a40a-abc11f97c648", APIVersion:"snapshot.storage.k8s.io/v1beta1", ResourceVersion:"1565", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to get snapshot class or credentials
E1029 10:17:55.679735       1 snapshot_controller.go:481] failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:17:55.679764       1 snapshot_controller.go:224] getCSISnapshotInput failed to getClassFromVolumeSnapshot failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:17:55.679819       1 goroutinemap.go:150] Operation for "delete-snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4" failed. No retries permitted until 2020-10-29 10:18:59.679780946 +0000 UTC m=+534.637309142 (durationBeforeRetry 1m4s). Error: "failed to get input parameters to delete snapshot for content snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4: \"failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: \\\"volumesnapshotclass.snapshot.storage.k8s.io \\\\\\\"csi-rbdplugin-snapclass\\\\\\\" not found\\\"\""
I1029 10:17:55.681554       1 event.go:281] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", UID:"ffc95922-3b31-494e-a40a-abc11f97c648", APIVersion:"snapshot.storage.k8s.io/v1beta1", ResourceVersion:"1565", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to get snapshot class or credentials
E1029 10:19:55.680458       1 snapshot_controller.go:481] failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:19:55.680653       1 snapshot_controller.go:224] getCSISnapshotInput failed to getClassFromVolumeSnapshot failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: "volumesnapshotclass.snapshot.storage.k8s.io \"csi-rbdplugin-snapclass\" not found"
E1029 10:19:55.680896       1 goroutinemap.go:150] Operation for "delete-snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4" failed. No retries permitted until 2020-10-29 10:21:57.68076332 +0000 UTC m=+712.638291701 (durationBeforeRetry 2m2s). Error: "failed to get input parameters to delete snapshot for content snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4: \"failed to retrieve snapshot class csi-rbdplugin-snapclass from the informer: \\\"volumesnapshotclass.snapshot.storage.k8s.io \\\\\\\"csi-rbdplugin-snapclass\\\\\\\" not found\\\"\""
I1029 10:19:55.687959       1 event.go:281] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-f4e2764a-7f9f-4619-a89e-9d7d1318bcd4", UID:"ffc95922-3b31-494e-a40a-abc11f97c648", APIVersion:"snapshot.storage.k8s.io/v1beta1", ResourceVersion:"1565", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to get snapshot class or credentials
```

cc @xing-yang 
@xing-yang
Collaborator

Which version is this? I thought the problem is already fixed.

@Madhu-1
Contributor Author

Madhu-1 commented Oct 30, 2020

I think I tested with an old 2.x version. I verified that the latest master sends the delete-snapshot request to the CSI driver. Thanks @xing-yang

One question: the VolumeSnapshot won't be deleted until the CSI snapshot or VolumeSnapshotContent is deleted, is that correct?
In the PVC case, by contrast, Kubernetes deletes the PVC object and then the external-provisioner takes care of deleting the PV and the CSI volume.

@xing-yang
Collaborator

The Alpha snapshot design was almost completely modeled after the PV/PVC design, but we made enhancements to the snapshot design when moving to Beta. We delete CSI snapshot first, then VolumeSnapshotContent, and finally VolumeSnapshot to avoid leaking resources.
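
The ordering described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `driver`, `cluster`, and `deleteInOrder` are hypothetical in-memory stand-ins, not types or functions from external-snapshotter. The point is that the API objects are removed only after the backend snapshot is gone, so a failed backend delete leaves everything in place for a retry instead of orphaning the backend snapshot.

```go
package main

import (
	"errors"
	"fmt"
)

// driver is a hypothetical stand-in for the CSI driver's snapshot backend.
type driver struct {
	handles map[string]bool // backend snapshot handles that still exist
	fail    bool            // when true, every delete call fails
}

func (d *driver) deleteSnapshot(handle string) error {
	if d.fail {
		return errors.New("backend delete failed")
	}
	delete(d.handles, handle) // deleting a missing handle is a no-op
	return nil
}

// cluster is a hypothetical in-memory model of the API objects.
type cluster struct {
	snapshots map[string]string // VolumeSnapshot name -> bound content name
	contents  map[string]string // VolumeSnapshotContent name -> backend handle
}

// deleteInOrder removes the backend snapshot, then the content, then the
// VolumeSnapshot. It stops at the first failure so nothing is orphaned.
func deleteInOrder(c *cluster, d *driver, snapshot string) error {
	content, ok := c.snapshots[snapshot]
	if !ok {
		return nil // already gone
	}
	if handle, ok := c.contents[content]; ok {
		if err := d.deleteSnapshot(handle); err != nil {
			return fmt.Errorf("delete backend snapshot %q: %w", handle, err)
		}
		delete(c.contents, content)
	}
	delete(c.snapshots, snapshot)
	return nil
}

func main() {
	c := &cluster{
		snapshots: map[string]string{"rbd-pvc-snapshot": "snapcontent-1"},
		contents:  map[string]string{"snapcontent-1": "handle-1"},
	}
	d := &driver{handles: map[string]bool{"handle-1": true}, fail: true}
	// First attempt fails at the backend, so the VolumeSnapshot remains.
	fmt.Println(deleteInOrder(c, d, "rbd-pvc-snapshot"), len(c.snapshots))
	d.fail = false
	// Second attempt succeeds and removes all three layers.
	fmt.Println(deleteInOrder(c, d, "rbd-pvc-snapshot"), len(c.snapshots))
}
```

This also shows why the VolumeSnapshot in this issue stays stuck: as long as the first step keeps failing, the later steps are never reached.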

@huffmanca
Contributor

We've investigated this a bit further, and here's my summary of what's occurring:

This issue occurs when the VolumeSnapshotClass is deleted and the driver requires credentials for creation/deletion. In this case, the driver fails to remove the backend snapshot, which prevents the VolumeSnapshotContent from being deleted, which in turn prevents the VolumeSnapshot from being deleted.

At this point I think we can fix the issue with the VolumeSnapshotClass being deleted and attempt to include the description in the VolumeSnapshotContent status message.

The relevant sections are below:

  • Here we try to get the class and return nil credentials if it's not found -
    if className != nil {
        class, err = ctrl.getSnapshotClass(*className)
        if err != nil {
            klog.Errorf("getCSISnapshotInput failed to getClassFromVolumeSnapshot %s", err)
            return nil, nil, err
        }
  • And then we pass these nil credentials into the deletion request -
    _, snapshotterCredentials, err := ctrl.getCSISnapshotInput(content)
    if err != nil && !errors.IsNotFound(err) {
        ctrl.eventRecorder.Event(content, v1.EventTypeWarning, "SnapshotDeleteError", "Failed to get snapshot class or credentials")
        return fmt.Errorf("failed to get input parameters to delete snapshot for content %s: %q", content.Name, err)
    }
    err = ctrl.handler.DeleteSnapshot(content, snapshotterCredentials)
    if err != nil {
        ctrl.eventRecorder.Event(content, v1.EventTypeWarning, "SnapshotDeleteError", "Failed to delete snapshot")
        return fmt.Errorf("failed to delete snapshot %#v, err: %v", content.Name, err)

Since we have nil credentials, the deletion request fails, and then we see this error logged from line 350.

The above is my current understanding of the situation, and I'm going to attempt to submit a PR so that we can get the credentials from the VolumeSnapshotContent's annotations.
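
As a rough sketch of that idea, assuming the secret reference is recorded on the VolumeSnapshotContent at creation time: the annotation keys below are placeholders chosen for illustration, and `secretRef` and `secretRefFromAnnotations` are hypothetical names rather than code from the eventual PR.

```go
package main

import "fmt"

// Placeholder annotation keys, assumed for illustration only.
const (
	annSecretName      = "example.com/deletion-secret-name"
	annSecretNamespace = "example.com/deletion-secret-namespace"
)

// secretRef identifies the Secret holding the driver credentials.
type secretRef struct {
	Name      string
	Namespace string
}

// secretRefFromAnnotations reads the secret reference stored on the content.
// It returns (nil, nil) when no reference was recorded, which means the
// driver does not need credentials for deletion.
func secretRefFromAnnotations(ann map[string]string) (*secretRef, error) {
	name, hasName := ann[annSecretName]
	ns, hasNS := ann[annSecretNamespace]
	if !hasName && !hasNS {
		return nil, nil
	}
	if name == "" || ns == "" {
		return nil, fmt.Errorf("incomplete secret reference: name=%q namespace=%q", name, ns)
	}
	return &secretRef{Name: name, Namespace: ns}, nil
}

func main() {
	ann := map[string]string{
		annSecretName:      "csi-rbd-secret",
		annSecretNamespace: "default",
	}
	ref, err := secretRefFromAnnotations(ann)
	fmt.Println(ref, err)
}
```

Because the reference lives on the content object itself, the deletion path no longer depends on the VolumeSnapshotClass still existing.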
