Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dataSource from PVC on backup for prior CSI restore case #6111

Merged
merged 1 commit into from
May 18, 2023

Conversation

eemcmullan
Copy link
Contributor

@eemcmullan eemcmullan commented Apr 11, 2023

Thank you for contributing to Velero!

Please add a summary of your change

Removes the dataSource field from PVCs on backup to resolve issue #6106

Does your change fix a particular issue?

Fixes #(issue)
#6106

Please indicate you've done the following:

  • [ x] Accepted the DCO. Commits without the DCO will delay acceptance.
  • [x ] Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

@eemcmullan eemcmullan changed the title Remove dataSource from PVC if had prior CSI restore Remove dataSource from PVC on backup for prior CSI restore case Apr 11, 2023
kaovilai
kaovilai previously approved these changes Apr 11, 2023
Copy link
Member

@kaovilai kaovilai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@blackpiglet
Copy link
Contributor

Please also add a changelog file.

@@ -61,5 +62,19 @@ func (a *PVCAction) Execute(item runtime.Unstructured, backup *v1.Backup) (runti
GroupResource: kuberesource.PersistentVolumes,
Name: pvc.Spec.VolumeName,
}
return item, []velero.ResourceIdentifier{pv}, nil
// remove dataSource if exists from prior restored CSI volumes
if pvc.Spec.DataSource != nil {
Copy link
Contributor

@Lyndon-Li Lyndon-Li Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we move the same to restore, specifically:
During restore, we judge if the DataSource/DataSourceRef is required (for CSI restore, it is required, for others it is not), if it is required, leave the fields there, otherwise, remove them as what we are doing here

Reasons:

  1. In backup, we may have multiple BIAs, a BIA doesn't know what others is doing or will do, so if not necessary, a BIA should not change the data comparing to the source, otherwise, we don't know what other BIAs will react, especially, DataSource/DataSourceRef are used by various data mover purposes
  2. As the principle, if not necessary, the backup procedure should not change anything comparing to the source. This principle guarantees the consistency and coherence of the restore data and finally guarantees the validation of the restore
  3. The current problem is right indicating the requirement that we need to take care of the DataSource/DataSourceRef in restore, just like what we do for Status fields. Therefore, it looks more appropriate to put it into restore

On the other hand, I am not seeing any direct error the current change will cause, so I will leave this to more members for judgements, @reasonerjt @blackpiglet @shubham-pampattiwar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the resources unchanged in backup is reasonable, but I also couldn't spot any scenarios broken by the current change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lyndon-Li the problem with clearing it on restore is that restore is where some plugins might add it. For example, CSI restore plugin adds it, so if the built-in RIA clears it, we could have a race condition, and there's no way of knowing within one plugin action whether another action has been called and/or will be called. Are there any use cases where having this in backup is needed? This field is only used for initial content creation for a new PVC. This source provided the content for the PVC at creation time. It will not have any content that was added/modified after creation -- for that we need the current content, via kopia/restic, CSI snapshot, or native snapshot. So I'm not sure we ever need/want this in the backup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sseago Yes, indeed, if we want to put the clear code in restore, we need to make sure that the fields are not added by a CSI RIA plugin. One idea is we put the clear code at early stage of restoreItem, e.g., somewhere around here, none of the RIAs has been called at this place, so it is safe to remove the fields without a check.

Are there any use cases where having this in backup is needed?

Personally, I don't see any need for them during in backup at present, so technically, I don't see any error will happen if we keep the current clear code in backup.
Merely, I just think it is more appropriate to have the code in restore (as the reasons mentioned above). Therefore, I raise this as a suggestion, I am fine with either way as the final decision.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were in the core restore logic rather than in a RestoreItemAction, it would work as long as it was processed before any plugins run. However, the restore workflow is already full of if blocks ("if groupResource ==", etc.) and this would add to the complexity, while doing it on backup as in this PR is putting it in code already filtered for PVCs. Also, if we're clearing the field before running any plugins, then it means the field from backup will never be used anyway, so we might as well remove it on backup. In general, we don't want to change much data on backup, as for most use cases, restore is more appropriate, but in cases where a field will never be useful for restore (or for RIA funcs), then removing on backup is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as for most use cases, restore is more appropriate, but in cases where a field will never be useful for restore (or for RIA funcs), then removing on backup is fine

OK, this sounds reasonable

Comment on lines +66 to +71
if pvc.Spec.DataSource != nil {
pvc.Spec.DataSource = nil
}
if pvc.Spec.DataSourceRef != nil {
pvc.Spec.DataSourceRef = nil
}
Copy link
Member

@kaovilai kaovilai Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Velero add label to items it restored

labels[velerov1api.BackupNameLabel] = label.GetValidName(backupName)

Do we wanna limit this action to items we are certain are restored by velero?
Or perhaps limit to just those restored by velero-csi-plugin?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think prior restore is necessarily relevant. While the specific failure case we've seen involves prior restore, I think these fields would almost always result in restoring a PVC with stale or invalid content.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then retitle the PR to a broader scope, then suggest CSI restore as one of the failure case.

@reasonerjt reasonerjt merged commit b891074 into vmware-tanzu:main May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants