Error when restoring PVC using CSI #7950

Closed
OlivierJavaux opened this issue Jun 28, 2024 · 6 comments

@OlivierJavaux

I upgraded Velero to 1.14, which now includes the CSI plugin.
I backed up a StatefulSet with 2 pods and 2 PVCs.
I tried to restore it, but the restore failed for one PVC. The second one was restored successfully.
I ran the test twice and got the same error both times.

$ velero restore describe ... --details
...
Restore Item Operations:
  Operation for persistentvolumeclaims test-pvc-a-detruire/data-test-pvc-0:
    Restore Item Action Plugin:  velero.io/csi-pvc-restorer
    Operation ID:                dd-fd63a347-1366-48d7-94a9-b104d6bac355.57a1a263-312a-4612b5eda
    Phase:                       Failed
    Operation Error:             DataDownload is canceled
    Progress description:        Canceled
    Created:                     2024-06-28 10:07:37 +0000 UTC
    Started:                     2024-06-28 10:07:37 +0000 UTC
    Updated:                     2024-06-28 10:07:37 +0000 UTC
  Operation for persistentvolumeclaims test-pvc-a-detruire/data-test-pvc-1:
    Restore Item Action Plugin:  velero.io/csi-pvc-restorer
    Operation ID:                dd-fd63a347-1366-48d7-94a9-b104d6bac355.f62646a5-6492-4b3637e16
    Phase:                       Completed
    Progress:                    8 of 8 complete (Bytes)
    Progress description:        Completed
    Created:                     2024-06-28 10:07:37 +0000 UTC
    Started:                     2024-06-28 10:07:57 +0000 UTC
    Updated:                     2024-06-28 10:08:11 +0000 UTC
...

$ kubectl describe datadownloads.velero.io ...
...
Status:
  Completion Timestamp:  2024-06-28T10:07:37Z
  Message:               found a dataupload velero/velero-test-pvc-20240628095707-20240628100734-x7b7t with expose error: Pod is unschedulable: 0/5 nodes are available: persistentvolumeclaim "velero-test-pvc-20240628095707-20240628100734-x7b7t" not found. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling.. mark it as cancel

Meanwhile, the application pod stays Pending because its PVC is not ready:

$ kubectl get pods -n test-pvc-a-detruire
NAME         READY   STATUS    RESTARTS   AGE
test-pvc-0   0/1     Pending   0          27m
test-pvc-1   1/1     Running   0          27m

It looks like Velero does not wait long enough for the pod to become ready before starting the data download process.
Is it possible to configure timeouts? I did not find an option for this.

Note that backups and restores worked properly with the previous release (1.13) and the additional CSI plugin.
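
To pick the failing operation out of a long `velero restore describe ... --details` output like the one above, a small filter can help. The following is only a minimal sketch: `list_unfinished_operations` is a hypothetical helper name (not part of Velero), and it assumes the output keeps the `Operation for ...`, `Phase:` and `Operation Error:` layout shown in this report.

```shell
#!/bin/sh
# Hypothetical helper, not a Velero command.
# Reads "velero restore describe ... --details" output on stdin and prints
# every restore item operation whose phase is not "Completed", followed by
# its "Operation Error" line when one is present.
list_unfinished_operations() {
  awk '
    /Operation for / {
      item = $0
      sub(/^.*Operation for /, "", item)   # drop the leading label
      sub(/:[ \t]*$/, "", item)            # drop the trailing colon
      failed = 0
      next
    }
    # Only consider Phase lines that belong to an operation block; the
    # restore-level Phase line appears before any operation and is skipped.
    item != "" && $1 == "Phase:" {
      failed = ($2 != "Completed")
      if (failed) print item ": " $2
      next
    }
    failed && $1 == "Operation" && $2 == "Error:" {
      err = $0
      sub(/^.*Operation Error:[ \t]*/, "", err)
      print "  error: " err
    }
  '
}
```

It would be used by piping the describe output into it, for example `velero restore describe ... --details | list_unfinished_operations`. If a different Velero version formats this section differently, the patterns would need adjusting.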

@sseago
Collaborator

sseago commented Jun 28, 2024

Yes, this is a known bug with 1.14.0. The fix is already merged to the release-1.14 branch and will be included in 1.14.1: #7926
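
Since the fix lands in 1.14.1, a script that gates restores on the installed version could compare version strings. This is a rough sketch under stated assumptions: `is_at_least_1_14_1` is a hypothetical helper name, it relies on `sort -V` (available in GNU coreutils), and how the version string is obtained from the cluster is left out. Note that it also returns false for 1.13.x, which this thread reports as working with the additional CSI plugin, so it only answers "is this 1.14.1 or newer".

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given version is 1.14.1 or newer.
# Assumes plain release versions such as "v1.14.0" or "1.15.0".
is_at_least_1_14_1() {
  version="${1#v}"   # accept both "v1.14.0" and "1.14.0"
  fixed="1.14.1"
  # sort -V orders version strings; if the lowest of the two is the fixed
  # version, the given version is equal to it or newer.
  lowest="$(printf '%s\n' "$version" "$fixed" | sort -V | head -n 1)"
  [ "$lowest" = "$fixed" ]
}
```

For example, `is_at_least_1_14_1 v1.14.0 || echo "still affected"` would print the message for the release discussed in this issue.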

@OlivierJavaux
Author

#7926 fixes it.

@SCLogo

SCLogo commented Jul 8, 2024

@sseago Do you have an ETA for the 1.14.1 release?

@sseago
Collaborator

sseago commented Jul 8, 2024

@SCLogo Tentative plan is early August.

@SCLogo

SCLogo commented Jul 8, 2024

Thanks

@vincmarz

vincmarz commented Aug 2, 2024

Hi! I have the same issue with Velero 1.14:
velero describe restore book-dev-test2-2024-08-02-restore
Name: book-dev-test2-2024-08-02-restore
Namespace: openshift-adp
Labels:
Annotations:

Phase: PartiallyFailed (run 'velero restore logs book-dev-test2-2024-08-02-restore' for more information)
Total items to be restored: 52
Items restored: 52

Started: 2024-08-02 12:54:07 +0200 CEST
Completed: 2024-08-02 13:04:17 +0200 CEST
[...]
Errors:
  Velero: error from restore item operation: DataDownload is canceled
  Cluster:
  Namespaces:
    book-dev: fail to patch dynamic PV, err: context deadline exceeded, PVC: wanjaserver, PV: pvc-a905b9f7-d4f8-4768-ad86-6bfb5e724fdd

For now I have to use the previous release (1.13) with the additional CSI plugin.

Bye
