
Add proposal for transfer of VolumeSnapshot #2849

Closed

Conversation


@Elbehery Elbehery commented Aug 2, 2021

Attempts to continue the work done by @j-griffith in #643 and #1112, @mhenriks in #1555 and #2110, and @huffmanca in #2326 to allow transferring VolumeSnapshots across different namespaces

xref #2848

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 2, 2021
@k8s-ci-robot
Contributor

Welcome @Elbehery!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @Elbehery. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 2, 2021
@Elbehery
Author

Elbehery commented Aug 2, 2021

/assign @mhenriks
/assign @jsafrane
/assign @mkimuram
/assign @bswartz
/assign @huffmanca

@k8s-ci-robot
Contributor

@Elbehery: GitHub didn't allow me to assign the following users: mhenriks.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @mhenriks
/assign @jsafrane
/assign @mkimuram
/assign @bswartz
/assign @huffmanca

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Elbehery
Author

Elbehery commented Aug 2, 2021

As a reference: design session recordings.

Also, this doc is used to track previous attempts, discussions, and suggestions.

@jingxu97
Contributor

jingxu97 commented Aug 2, 2021

cc @msau42 @jingxu97

Contributor

@mkimuram mkimuram left a comment

@Elbehery

Thank you for creating this KEP PR.


`VolumeSnapshotTransferRequest` resource:
Contributor

Should we keep the names StorageTransferRequest and StorageTransferAccept (or StorageTransferApproval), so that other resources can be transferred in the future?
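For illustration only, a generic request of that kind might carry the resource kind in its source, so the same CRD could later cover PVCs. The group, version, and field names below are assumptions for this sketch, not part of the proposal:

```yaml
apiVersion: transfer.storage.k8s.io/v1alpha1   # hypothetical group/version
kind: StorageTransferRequest
metadata:
  name: transfer-foo
  namespace: source-namespace
spec:
  source:
    kind: VolumeSnapshot        # could later be PersistentVolumeClaim
    name: foo
  targetNamespace: target-namespace
```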

Contributor

+1

Member

If we're planning to extend the same CRD to PVCs (which could be a good idea), then we would need to have at least a good design how PVCs will be transferred before going beta with snapshots, to make sure the API is correct. This is out of scope of this PR, but worth a note somewhere.

Author

@Elbehery Elbehery Aug 11, 2021

@jsafrane I think holding on to have a good design which includes PVC has been discussed in the design sessions.
We concluded to move on with VolumeSnapshot without secrets, so that we can have something working and gather feedback.

IMO we should move on with VolumeSnapshot only, and indeed before promoting to BETA we can revisit the design/impl to include PVCs and secrets, please correct me if I am wrong 🙏🏽

Author

@xing-yang @mkimuram +1, fixing it


- Transfer of resources other than VolumeSnapshot will not be discussed.
- Transfer of VolumeSnapshots which contain secret.
Contributor

I think that we should discuss this. Transferring snapshots without secrets can be a milestone for alpha, for example. However, I guess that snapshots with secrets will need to be supported before GA.

Also, I think that secrets defined without a template such as $namespace should work well even after a transfer in most cases, so they could be added to the scope of the initial implementation.

Member

+1, I think we should have a solid design for secrets handling. It does not need to be in the initial implementation, but we don't want to realize later that the approach we design here is wrong.

Contributor

To make StorageTransferRequest a generic interface and handle resource-specific requirements, we will need to add something like a spec.options field that allows key/value pairs (map[string]string) to be passed by users. With this field, users can pass resource-specific options, such as whether to overwrite secrets, through the StorageTransferRequest to the transfer controller of each resource.

spec:
  options:
    secretOverwrite: "false"

And by doing so, we might be able to decide not to implement the secretOverwrite = true case for alpha, leaving room to provide this feature later without changing the CRD definition.

Also, the options can be resource-specific, like below:

spec:
  options:
    snapshotterSecretName: "newSecret"
    snapshotterSecretNamespace: "destinationNamespace"

Author

To make StorageTransferRequest a generic interface and handle resource-specific requirements, we will need to add something like a spec.options field that allows key/value pairs (map[string]string) to be passed by users. With this field, users can pass resource-specific options, such as whether to overwrite secrets, through the StorageTransferRequest to the transfer controller of each resource.

@mkimuram @jsafrane in the design session we agreed to have a validation webhook to reject the transfer in case it contains secrets

Contributor

Let me explain my intention behind my comments here and here in more detail.

If snapshotter-secret is defined with a fixed value like below in volumeSnapshotClass:

csi.storage.k8s.io/snapshotter-secret-name: shared-secret
csi.storage.k8s.io/snapshotter-secret-namespace: default

The admin's intention will be that everyone in the cluster is allowed to use the same secret (shared-secret in the default namespace in the example above), because VolumeSnapshotClass is cluster-scoped; as a result, everyone in the cluster who uses the class can and will take snapshots using the same secret. Therefore, no one should care whether the secret causes issues after a transfer.

On the other hand, if snapshotter-secret is defined with a template like below in volumeSnapshotClass:

  • (1) per namespace
csi.storage.k8s.io/snapshotter-secret-name: per-namespace-secret
csi.storage.k8s.io/snapshotter-secret-namespace: ${volumesnapshot.namespace}

or

  • (2) per snapshot
csi.storage.k8s.io/snapshotter-secret-name: ${volumesnapshot.name}
csi.storage.k8s.io/snapshotter-secret-namespace: ${volumesnapshot.namespace}

The admin's intention will be to use a different secret per namespace or per snapshot, because the secret used to take a snapshot differs per namespace or per snapshot. In this situation, whether the secret may still be used after a transfer depends on the individual transfer. For example, when Alice in namespace ns-a transfers her snapshot-a to Bob in namespace ns-b, she may or may not want to allow Bob to keep using her per-namespace-secret or snapshot-a secret in ns-a.
(Note that Bob won't be able to see Alice's secret directly; only the driver accesses it when handling the snapshot.)

With the above in mind, we can say:

  • (a) No-secret case: no need to care about secrets.
  • (b) Secret with a fixed value: no need to care whether the secret causes issues after transfer.
  • (c) Secret with a templated value ((1) and (2) above): it depends. Keeping the secret as is may be OK if the user agrees, but if not, it needs to be changed somehow.

Therefore, I'm suggesting not to focus only on (a), but to include (b) in scope as well.
And if possible, though I'm not saying it is mandatory, we should consider how (c) could be handled later. At a minimum, we can define an option specifying whether keeping the secret as is is allowed, and transfer the snapshot while keeping the secret as is (we can make allow the default and leave the disallow case unimplemented for alpha).

P.S.
After writing the above, I started to wonder whether updating the secret after a transfer, by re-evaluating the template, makes sense in case (1).
There will likely already be another secret with the same name and different contents in the destination namespace. And even if there isn't, sharing the contents of the source namespace's secret with the destination namespace won't be a good idea, because it should be a "secret" for that particular namespace (there is more security risk in sharing the contents than in sharing the reference). This idea should still be valid for case (2), though.

Maybe, if users really care about sharing secrets, they will need to take the snapshot with a snapshot class that uses a per-snapshot secret, and transfer the snapshot after sharing that secret in some way (this idea would be hard to apply to PVCs, but that is out of scope for now).


Based on community design sessions, and inspired by [API for PVC Namespace Transfer](https://github.com/kubernetes/enhancements/pull/2110/files), the following is being proposed :-

- `VolumeSnapshotTransferRequest` resources are created in the target namespace.
Contributor

@mkimuram mkimuram Aug 3, 2021

Is my understanding right that there are changes in the model regarding who creates VolumeSnapshotTransferRequest and VolumeSnapshotTransferApproval, and in which namespace?

Current model:

  1. VolumeSnapshotTransferRequest is created in the target namespace (where the snapshot should be transferred to) by a user who doesn't currently own the snapshot.
  2. VolumeSnapshotTransferApproval is created in the source namespace (where the snapshot currently exists) by an owner of the snapshot.

Previous model:

  1. StorageTransferRequest is created in the source namespace (where the snapshot currently exists) by an owner of the snapshot (also see here).
  2. StorageTransferAccept is created in the target namespace (where the snapshot should be transferred to) by a user who doesn't currently own the snapshot.

I prefer the current model, because these changes, together with other modifications, will provide the following guarantees:

  • 1:1 mapping of VolumeSnapshotTransferRequest and VolumeSnapshotTransferApproval (no other approvals in other namespaces)
  • 1:1 mapping of VolumeSnapshotTransferRequest and target snapshot (no other snapshots in other namespaces)

We still need to handle other corner cases, like multiple VolumeSnapshotTransferRequests for one source snapshot, but this seems like a big improvement.

Member

I slightly prefer that VolumeSnapshotTransferRequest is created in the source namespace too. It could make it easier to ensure that only one transfer of a snapshot happens at a time, for example by requiring VolumeSnapshot.Name == VolumeSnapshotTransferRequest.Name (see the sketch below).
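For illustration, a minimal sketch of that constraint, assuming the request lives in the source namespace and the controller only acts when the request name matches an existing VolumeSnapshot in the same namespace (group, version, and field names are assumptions):

```yaml
apiVersion: transfer.storage.k8s.io/v1alpha1   # hypothetical group/version
kind: VolumeSnapshotTransferRequest
metadata:
  name: foo                     # must equal the VolumeSnapshot name, so at most one
  namespace: source-namespace   # transfer request can exist per snapshot
spec:
  targetNamespace: target-namespace
```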

Author

+1, sorry for the confusion its my mistake 🙏🏽

source:
  name: foo
  namespace: source-namespace
  kind: VolumeSnapshot
Contributor

If we rename to VolumeSnapshotTransferRequest, spec.source.kind won't be necessary.

Author

@mkimuram I agree it will not be necessary, but I prefer to keep kind to follow the VolumeSnapshot spec.
AFAIK, a VolumeSnapshot can have either a VolumeSnapshotContent or a PersistentVolumeClaim as its source, and kind is determined based on that.

Author

I also renamed the CRDs as per your review :)

spec:
  source:
    name: foo
    kind: VolumeSnapshot
Contributor

If we rename to VolumeSnapshotTransferApproval, spec.source.kind won't be necessary.

Author

I also renamed the CRDs as per your review :)

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Elbehery
To complete the pull request process, please ask for approval from jsafrane after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@mkimuram mkimuram left a comment

Also added comments for Design Details.


- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot.
- Deletes the source VolumeSnapshot.
Contributor

Please add descriptions of how to delete the source VolumeSnapshot object without deleting the VolumeSnapshotContent that was previously bound to it.

There will be two implementation ideas:

  1. Set DeletionPolicy on the VolumeSnapshotContent and restore it after the source VolumeSnapshot is deleted (a minimal sketch of this idea follows below).
    In this case, we also need to consider where to store the original DeletionPolicy for the restore. In the kubevirt implementation, it is stored in the ObjectTransfer CRD.

  2. Unbind from the source VolumeSnapshot before deleting it, as is done for PVCs in my PoC, here.
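For illustration, a rough sketch of idea 1: before deleting the source VolumeSnapshot, the controller could patch the bound VolumeSnapshotContent to Retain and record the original policy. deletionPolicy is an existing VolumeSnapshotContent field; the annotation used to remember the original policy is a hypothetical choice:

```yaml
# Merge patch applied to the bound VolumeSnapshotContent before the source
# VolumeSnapshot is deleted, so the backend snapshot survives the deletion.
metadata:
  annotations:
    transfer.storage.k8s.io/original-deletion-policy: Delete   # hypothetical annotation
spec:
  deletionPolicy: Retain
```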

Contributor

This won't be needed if we choose the approach of creating another VolumeSnapshotContent, as @xing-yang explained (a sketch follows below). For that approach, we still need to be careful that two VolumeSnapshotContents reference a single actual backend volume before deleting the old one. Also, we will need to be careful not to delete the actual volume when deleting either of the VolumeSnapshotContents.
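For the approach @xing-yang describes, the controller would pre-provision a second VolumeSnapshotContent that points at the same snapshot handle and is bound to the new VolumeSnapshot in the target namespace. A sketch using the existing VolumeSnapshotContent API; the names and the handle are placeholders:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: snapcontent-transferred-foo            # placeholder name
spec:
  deletionPolicy: Retain                       # keep Retain until the old content is cleaned up
  driver: hostpath.csi.k8s.io
  source:
    snapshotHandle: <same-handle-as-the-old-content>   # same backend snapshot
  volumeSnapshotRef:
    name: foo
    namespace: target-namespace
```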

Author

I have updated it above; it now uses a new VolumeSnapshotContent.

Once matching is found, the transfer process is initiated by the controller, and it performs the following:

- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot.
Contributor

Please add descriptions of how to prevent the VolumeSnapshot and VolumeSnapshotContent from being used while the transfer is in progress.
For example, "create volume from snapshot" may be requested against the source VolumeSnapshot while it is being transferred, and the VolumeSnapshotContent may be deleted through the target VolumeSnapshot while the volume is still being created from the snapshot.

There will be two implementation ideas:

  1. Delete the VolumeSnapshot before the actual transfer starts.
    Please also see the kubevirt implementation.

  2. Block such actions through the VolumeSnapshot's status (see the sketch after this list).
    Code like this can be added to avoid such situations if it has a status.
    Also, VolumeSnapshotStatus seems to have a ReadyToUse field. We might be able to re-use it for this purpose.

Author

+1, fixed

- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot.
- Deletes the source VolumeSnapshot.

Contributor

Please add descriptions of how to avoid issues caused by multiple VolumeSnapshotTransferRequests for one source VolumeSnapshot. Unless transfer operations are atomic, two in-flight transitions for different VolumeSnapshotTransferRequests will produce an undesirable state.

Adding status fields that record which VolumeSnapshotTransferRequest is currently being handled will help block other requests from being processed at the same time.

Author

shall we use what @jsafrane suggested here #2849 (comment)

Contributor

shall we use what @jsafrane suggested here #2849 (comment)

It would be one way to reduce the risk, and it would be good to change it that way, but it won't completely prevent this issue from happening. Another user in the namespace might still create a request for the same VolumeSnapshot.

One idea is to give StorageTransferRequest statuses such as pending and running; before the controller changes it to running, it should check whether the target VolumeSnapshot carries some kind of mark showing that the VolumeSnapshot is already being transferred. Only if the VolumeSnapshot is not being transferred should the controller set the mark on it and move the StorageTransferRequest to running.

I think the mark would be a finalizer and/or annotation(s) carrying the name and namespace of the StorageTransferRequest, which should also help handle the cancellation path that I mentioned in other comments (a sketch follows below).
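For illustration, a minimal sketch of such a mark on the source VolumeSnapshot; the finalizer and annotation names are assumptions chosen for this example:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: foo
  namespace: source-namespace
  finalizers:
    - transfer.storage.k8s.io/transfer-protection           # hypothetical finalizer
  annotations:
    transfer.storage.k8s.io/request-name: transfer-foo      # which request holds the "lock"
    transfer.storage.k8s.io/request-namespace: source-namespace
spec:
  source:
    persistentVolumeClaimName: csi-pvc
```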

- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot.
- Deletes the source VolumeSnapshot.

Contributor

Please add descriptions of what happens if the VolumeSnapshotTransferRequest and/or VolumeSnapshotTransferApproval is deleted, which would be expected to cancel the operation. To handle this properly, both resources should have at least statuses like pending and running, and they need a finalizer to roll back the other resources' status if they are marked as running.

Author

@mkimuram would you please elaborate more on this ?

You mean if the CRDs themselves are deleted ?

Contributor

@mkimuram mkimuram Aug 12, 2021

You mean if the CRDs themselves are deleted ?

Yes. The worst-case scenario would be that the controller gets an error for some reason at the very last step, and before it retries that step, the CR for it is deleted. When that happens, for example, there will be two VolumeSnapshotContents for the same backend volume, which may both have a Retain policy. Even in such cases, it should be possible to roll back to some desirable state, which I believe @bswartz mentioned elsewhere.

@xing-yang
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 4, 2021
This KEP proposes a method for transferring VolumeSnapshot between namespaces.

Currently, VolumeSnapshot, and VolumeSnapshotContent are bound to specific namespace. The binding relationship between these two storage resources is one-to-one.
Contributor

VolumeSnapshotContent is cluster scoped, not namespace scoped.

Author

@xing-yang thanks, i missed this :)

### Goals
- Use CRDs for VolumeSnapshot Namespace Transfer request/approval.
- Edit external SnapshotController to accept/reject/init VolumeSnapshot transfer.
- The source and target VolumeSnapshot will point to the same VolumeSnapshotContent.
Contributor

This can't happen as VolumeSnapshot and VolumeSnapshotContent have 1:1 mapping. Once they are bound, you can't change it. This is by design. You'll have to create a new VolumeSnapshotContent for the transfer.

Author

+1


- Transfer of resources other than VolumeSnapshot will not be discussed.
Contributor

Add that PVC transfer may be discussed in the future.

Author

+1


`VolumeSnapshotTransferRequest` resource:
Contributor

+1

requestName: transfer-foo
```

When matching resources are detected, the controller initiate the transfer process.
Contributor

nit: initiate -> initiates

Author

👍🏽 🙏🏽

This KEP proposes the minimal mechanism to transfer a VolumeSnapshot between namespaces, while avoiding breaking the current API.
Many caveats have been discussed, including but not limited to:

- [StorageClass Secrets](https://kubernetes-csi.github.io/docs/secrets-and-credentials-storage-class.html).
Contributor

Can you explain more why this is a problem?

Many caveats have been discussed, including but not limited to:

- [StorageClass Secrets](https://kubernetes-csi.github.io/docs/secrets-and-credentials-storage-class.html).
- VolumeSnapshotContent.Spec.VolumeSnapshotRef is `immutable`.
Contributor

This should not be a problem. Just need to create a new VolumeSnapshotContent pointing to the same snapshot handle.

Author

@xing-yang removed 🙏🏽

Once matching is found, the transfer process is initiated by the controller, and it performs the following:

- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot.
Contributor

As mentioned earlier, a new VolumeSnapshotContent needs to be created.

Author

fixed

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.22"
Contributor

This should be 1.23 as this work won't make it into 1.22.

Author

fixed

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: VolumeSnapshotNamespaceTransfer
Contributor

I believe we discussed during the design meeting that the implementation will happen out-of-tree. In this case, we won't need an in-tree feature gate. Maybe we can consider an out-of-tree feature gate.

Author

fixed

@xing-yang
Contributor

/assign @yuxiangqian

@Elbehery Elbehery force-pushed the transfer_volumesnapshot_namespace branch 2 times, most recently from 9217b42 to b782070 Compare August 11, 2021 15:45
feature-gates:
- name: VolumeSnapshotNamespaceTransfer
components:
- kube-apiserver (assuming we introduce the StorageTransfer* resources)
Contributor

Remove kube-apiserver and clarify that this is an out-of-tree feature gate for external-snapshotter
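One possible way the kep.yaml stanza could read after that change; the component name shown is an assumption about how an out-of-tree gate would be recorded:

```yaml
feature-gates:
  - name: VolumeSnapshotNamespaceTransfer
    components:
      - external-snapshotter   # out-of-tree feature gate; kube-apiserver is not involved
```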

Author

+1 🙏🏽

@Elbehery Elbehery force-pushed the transfer_volumesnapshot_namespace branch from b782070 to b2db044 Compare August 11, 2021 15:55
@mkimuram
Contributor

mkimuram commented Nov 2, 2021

I've implemented a minimal prototype of this comment, here.

As a minimal prototype, it is implemented directly in external-provisioner, and the VolumeSnapshotLink CRD also lives in the same repo; however, it could be implemented another way, such as implementing it as a volume populator and moving the VolumeSnapshotLink CRD into the external-snapshotter repo. Let's just use this for further discussion.

Testing it as described below, it seems to work well, at least for the use case of creating a PVC from a VolumeSnapshotLink.

  1. Deploy k8s with AnyVolumeDataSource feature gate enabled
  • ex) In k/k directory:
FEATURE_GATES=AnyVolumeDataSource=true hack/local-up-cluster.sh
  2. Deploy volumeSnapshot controller and csi-hostpath driver
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml

cd /tmp
git clone --depth=1 https://github.com/kubernetes-csi/csi-driver-host-path.git
cd csi-driver-host-path
./deploy/kubernetes-latest/deploy.sh
  3. Prepare for using this feature
  • Deploy ReferencePolicy CRD and VolumeSnapshotLink CRD and update RBAC for them
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/master/config/crd/v1alpha2/gateway.networking.k8s.io_referencepolicies.yaml
curl -L https://raw.githubusercontent.com/mkimuram/external-provisioner/transfer-by-reference/client/config/crd/transfer.storage.k8s.io_volumesnapshotlinks.yaml | sed 's|controller-gen.kubebuilder.io/version: v0.7.0|api-approved.kubernetes.io: https://github.com/kubernetes/enhancements/pull/2849|' | kubectl apply -f -

kubectl apply -f https://raw.githubusercontent.com/mkimuram/external-provisioner/transfer-by-reference/deploy/kubernetes/rbac.yaml
  • Build external-provisioner with this feature and update the image for csi-hostpathplugin
cd /tmp
git clone --depth=1 -b transfer-by-reference  https://github.com/mkimuram/external-provisioner.git
cd external-provisioner
make container-csi-provisioner
kubectl set image statefulset csi-hostpathplugin csi-provisioner=csi-provisioner:latest
  4. Create StorageClass, PVC, and VolumeSnapshot by the examples in the csi-hostpath repo

Note that /var/lib/csi-hostpath-data/${volumeHandle}/dummydata is touched for the original PV before taking the snapshot, so this file should be included in the volume restored from the snapshot.

cd /tmp/csi-driver-host-path
kubectl apply -f examples/csi-storageclass.yaml
kubectl apply -f examples/csi-pvc.yaml

kubectl get pvc,pv
NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/csi-pvc   Bound    pvc-43b57144-0666-41f0-9554-cff9a42daeb5   1Gi        RWO            csi-hostpath-sc   5s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS      REASON   AGE
persistentvolume/pvc-43b57144-0666-41f0-9554-cff9a42daeb5   1Gi        RWO            Delete           Bound    default/csi-pvc   csi-hostpath-sc            4s


touch /var/lib/csi-hostpath-data/$(kubectl get pv $(kubectl get pvc csi-pvc -o jsonpath='{.spec.volumeName}') -o jsonpath='{.spec.csi.volumeHandle}')/dummydata


kubectl apply -f examples/csi-snapshot-v1beta1.yaml

kubectl get volumesnapshot,volumesnapshotcontent
NAME                                                       READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS            SNAPSHOTCONTENT                                    CREATIONTIME   AGE
volumesnapshot.snapshot.storage.k8s.io/new-snapshot-demo   true         csi-pvc                             1Gi           csi-hostpath-snapclass   snapcontent-20514895-2c63-4f1c-8f43-990208ae8f87   5s             5s

NAME                                                                                             READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                VOLUMESNAPSHOTCLASS      VOLUMESNAPSHOT      VOLUMESNAPSHOTNAMESPACE   AGE
volumesnapshotcontent.snapshot.storage.k8s.io/snapcontent-20514895-2c63-4f1c-8f43-990208ae8f87   true         1073741824    Delete           hostpath.csi.k8s.io   csi-hostpath-snapclass   new-snapshot-demo   default                   5s
  5. Test creating PVC in ns1 namespace from VolumeSnapshot in default namespace
  • Create ReferencePolicy, VolumeSnapshotLink, and PersistentVolumeClaim
kubectl create ns ns1

cat << EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: ReferencePolicy
metadata:
  name: bar
  namespace: default
spec:
  from:
  - group: transfer.storage.k8s.io
    kind: VolumeSnapshotLink
    namespace: ns1
  to:
  - group: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-demo
EOF

cat << EOF | kubectl apply -f -
apiVersion: transfer.storage.k8s.io/v1alpha1
kind: VolumeSnapshotLink
metadata:
  name: foolink
  namespace: ns1
spec:
  source:
    name: new-snapshot-demo
    namespace: default
EOF

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foo-pvc
  namespace: ns1
spec:
  storageClassName: csi-hostpath-sc
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: transfer.storage.k8s.io
    kind: VolumeSnapshotLink
    name: foolink
  volumeMode: Filesystem
EOF
  • Confirm that the PVC and PV are created from the volumeSnapshot
kubectl get pvc,pv -n ns1 
NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/foo-pvc   Bound    pvc-bbd237a9-4d92-4b10-b597-7154909e88a4   1Gi        RWO            csi-hostpath-sc   2m33s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS      REASON   AGE
persistentvolume/pvc-43b57144-0666-41f0-9554-cff9a42daeb5   1Gi        RWO            Delete           Bound    default/csi-pvc   csi-hostpath-sc            3m51s
persistentvolume/pvc-bbd237a9-4d92-4b10-b597-7154909e88a4   1Gi        RWO            Delete           Bound    ns1/foo-pvc       csi-hostpath-sc            26s
  • Confirm that the PV contains the dummydata
ls -l /var/lib/csi-hostpath-data/$(kubectl get pv $(kubectl get pvc foo-pvc -n ns1 -o jsonpath='{.spec.volumeName}') -o jsonpath='{.spec.csi.volumeHandle}')
-rw-r--r--. 1 root root 0 Nov  2 21:25 dummydata

@Elbehery
Author

Elbehery commented Nov 4, 2021

@mkimuram Thanks a lot for your proposal and PoC. I can't find the words.

Shall we discuss this in today's meeting?

cc @xing-yang @jsafrane

@xing-yang
Contributor

  1. Also add AnyVolumeDataSource feature to snapshot feature (allow creating VolumeSnapshot from
    VolumeSnapshotLink and create backup from it)

When creating a PVC from a data source, a new volume will be created on the storage system. When creating a VolumeSnapshot from a data source, it still points to the same snapshot on the storage system. These have different meanings and will add confusion to the API definitions.

@mkimuram
Contributor

mkimuram commented Nov 4, 2021

@bswartz As for encryption, CSI spec doesn't seem to define anything about it. However, many CSI drivers implement encryption by their own ways (For example, Ceph, Portworx, and GKE).

So, whether this transfer implementation's handling of secrets causes problems will depend on how they implement their encryption. In general, we can say that:

  • If the CSI driver only uses snapshotter-secret for "creation" and "deletion" as described here, not touching snapshotter-secret won't cause issues (unless one insists that creation includes both creation and encryption).
  • If the same encryption secret is shared within a cluster, or within a StorageClass or a VolumeSnapshotClass, encryption secrets won't cause issues either.

However, if Secrets like snapshotter-secret and provisioner-secret are also used for encryption and the Secrets are per namespace or per volume/VolumeSnapshot, it may cause issues (this is highly dependent on the driver's implementation and could cause issues regardless of the transfer implementation, especially when data is encrypted using Secrets defined in the StorageClass and not decrypted when taking a snapshot).

@bswartz
Contributor

bswartz commented Nov 5, 2021

@bswartz As for encryption, CSI spec doesn't seem to define anything about it. However, many CSI drivers implement encryption by their own ways (For example, Ceph, Portworx, and GKE).

So, whether this transfer implementation of handling secret affects or not will depend on how they implement their encryption. In general, we would be able to say that:

* If CSI driver just uses `snapshotter-secret` for "creation" and "deletion" as described [here](https://kubernetes-csi.github.io/docs/secrets-and-credentials-volume-snapshot-class.html#createdelete-volumesnapshot-secret), not touching `snapshotter-secret` won't affect (Unless one insists that creation includes creation and encryption),

* If the same encryption secret is shared within a cluster or within a StorageClass or a VolumeSnapshotClass, encryption secrets won't affect.

However, if Secrets like snapshotter-secret and provisioner-secret are also used for encryption and the Secrets are per namespace or per volume/volumeSnapshot, it may cause issues (And it is highly dependent on their implementation and would cause issues regardless of transfer implementation, especially when it is encrypted by Secrets defined in StorageClass and not decrypted on taking snapshot).

Yeah, we can't know what specific drivers are doing with secrets, but what we do know is that the secrets that get used CAN be tied to the namespace of the PVC or VolumeSnapshot object, and thus it's very easy to build a system that only works if PVCs and VolumeSnapshots never change namespaces. The moment we introduce a different system where the namespace can change, it's impossible to know what might break, but it seems clear that things can and will break.

The choices we have for this design seem to be:

  1. Ignore this problem and only officially support the mechanism for drivers that don't rely on secrets. It would be up to individual vendors to design their secret-management regimes to support namespace transfers.
  2. Try to solve the problem internally in our implementation such that any needed secrets are just pulled from the correct namespace. My worry with this is that it could open up security holes depending on what an individual CSI driver is using the secrets for.
  3. Try to extend the design to explicitly specify whether any dependent secrets should be copied or moved along with the snapshot, so admins would have additional knobs to control this behavior. This would make the feature more complex, and I'm still not convinced it's powerful enough to address all cases securely.

In case it's not clear -- I think the nightmare case (from the perspective of this design) is a CSI driver that explicitly requires that the same secret template be used for both snapshot and volume creation (we don't care why, just that it does), and the template contains a namespace component, such that each namespace has a distinct secret. If we could make a design that addresses this case without creating security holes, then I think we'd be okay.
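For illustration, the case described above can be written with the standard CSI secret-template parameters, so that both provisioning and snapshotting resolve to a per-namespace secret; after a transfer, the template would resolve to a different namespace than the one the snapshot was created in:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-per-namespace-secret
provisioner: hostpath.csi.k8s.io
parameters:
  csi.storage.k8s.io/provisioner-secret-name: per-namespace-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ${pvc.namespace}
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-per-namespace-secret
driver: hostpath.csi.k8s.io
deletionPolicy: Delete
parameters:
  csi.storage.k8s.io/snapshotter-secret-name: per-namespace-secret
  csi.storage.k8s.io/snapshotter-secret-namespace: ${volumesnapshot.namespace}
```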

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 3, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 5, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Elbehery
Author

Elbehery commented Apr 8, 2022

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Apr 8, 2022
@k8s-ci-robot
Contributor

@Elbehery: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

@Elbehery: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
--- | --- | --- | --- | ---
pull-enhancements-verify | 864529f | link | true | /test pull-enhancements-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mkimuram
Contributor

mkimuram commented May 4, 2022

I've created another KEP based on the populator approach, here. Please give feedback to the KEP.

@xing-yang
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 13, 2022
@xing-yang
Contributor

/milestone v1.25

@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone May 13, 2022
@xing-yang
Contributor

I've created another KEP based on the populator approach, #3295. Please give feedback to the KEP.

Thanks @mkimuram. Just want to add a note that if we decide to move forward with #3295, we don't need this KEP any more.

@xing-yang xing-yang removed this from the v1.25 milestone May 29, 2022
@Elbehery
Author

Elbehery commented Jun 9, 2022

I've created another KEP based on the populator approach, #3295. Please give feedback to the KEP.

Thanks @mkimuram. Just want to add a note that if we decide to move forward with #3295, we don't need this KEP any more.

@xing-yang Hello ✋🏽 yes we can close this in favour of #3295

@mkimuram
Contributor

/close

Closing in favor of #3295 .

@k8s-ci-robot
Contributor

@mkimuram: Closed this PR.

In response to this:

/close

Closing in favor of #3295 .

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
