Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP Proposal: API for PVC Namespace Transfer #1555

Closed
wants to merge 3 commits into from

Conversation

mhenriks
Copy link

Signed-off-by: Michael Henriksen [email protected]

This follows on previous works by @j-griffith here and here

For now, let's focus on storage resources and specifically PVCs.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 17, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @mhenriks!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @mhenriks. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 17, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mhenriks
To complete the pull request process, please assign childsb
You can assign the PR to them by writing /assign @childsb in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Feb 17, 2020

- Make sure the associated PersistentVolume is not deleted/recycled during the transfer by setting reclaim policy to `Retain` (if not already).
- Delete the PVC `source-namespace\foo`.
- Bind The PersistentVolume to `target-namespace\bar`.
Copy link

@tedyu tedyu Feb 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier, targetName (with value of bar) is said to be optional.
If the value is omitted, the transfer would use existing name from the source.

It is better to explain this in the procedure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer not to be implicit, that's one of the reasons I proposed using the datasource model, we can easily enforce things like quota, limits and required claim parameters that way IMO.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, with the datasource model, the user will directly see if creating the target PVC fails because of a quota limitation. Otherwise, it will happen in the background and be reported in the status of the StorageTransferRequest.

What other required parameters are a concern? And does validating the new target PVC require referencing the source PVC to validate compatible parameters? If so, that would be complicated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main philosophy has been just to make it as close to the existing create workflow as possible and avoid assumptions. It also means you don't have to duplicate parameters or testing for them. Instead you use the existing create implementation as your foundation and build on it rather than trying to make it better or offering a divergent path. Just my 2 cents though.


// TargetName allows for the source and target resources to have different names
// It is optional
TargetName *string
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be better to always keep this field populated (which would simplify checking in the code)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tedyu - thanks for taking a look. Yes, your suggestion would definitely make the implementation simpler. But I wonder about the case in which the user doesn't have access to the source namespace and some coordination with an admin is necessary. Now one more piece of information is required to be communicated. Maybe it's not a big deal, but at this point I kind of feel like the benefit outweighs the cost.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the transfer is secure, I wonder under what scenario the user doesn't have access to the source namespace.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A hypothetical case could be something like: Department A needs a copy of a database (PVC) owned by department B. Each department operates only in their own namespace. The admins for A and B can work together to facilitate the transfer rather than getting a cluster admin involved.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to assume that a typical user of this API has access to both namespaces. If so, binding requests/responses is simpler. Dynamic binding can be a later feature.


```golang
type StorageTransferRequestStatus struct {
Complete bool
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change the bool field to int field where 100 (percent) means completion ?

  Progress int

This way, if the controller can estimate the progress, user would get better idea.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you suggest measuring progress? No data is transferred. The transfer simply involves updating metadata and should occur quickly after requests/approvals are matched.

Copy link

@tedyu tedyu Feb 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to use enum to represent each step as outlined starting line 147 above. Actually this field can be a string where 'done' or 'complete' marks the end of operation.
It is clearer than a single bool.


I believe the API can be implemented as CRDs and an external controller. If so, is a KEP really required? Would there be advantages to implementing in-tree?

What is the best way to deal with unbound PVCs (WaitForFirstConsumer)? Transfer the unbound PVC? Wait until it's bound? May be race conditions with the former.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to report error for unbound PVCs.
When the PVC is bound, the transfer process can start again.

to other resource types as compelling use cases come up.

`StorageTransferRequest` resources are created in the target namespace.
`StorageTransferApproval` resources are created in the source namespace.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kubernetes/sig-auth-api-reviews FYI.

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Feb 22, 2020
@sftim
Copy link
Contributor

sftim commented Feb 22, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 22, 2020
@@ -0,0 +1,283 @@
---
title: Namespace Transfer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title: Namespace Transfer
title: Namespace transfer for storage

@fejta-bot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

Signed-off-by: Michael Henriksen <[email protected]>
Signed-off-by: Michael Henriksen <[email protected]>
Copy link
Contributor

@j-griffith j-griffith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @mhenriks I had a couple of questions and a thought around tieing this together with DataSources; would be curious what you think of the idea.

When a controller detects matching request/approval resources, a transfer in initiated.

When transferring a PersistentVolumeClaim, the associated PersistentVolume will be updated,
the source PersistentVolumeClaim will be deleted, and the target PersistentVolumeClaim will be created.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I thought about this the last time around, the more I liked the idea of deleting the original claim; I think that's the right step FWIW.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In another comment here, you propose having the triggering the transfer via the pvc spec.dataSource field. In that case do you think it would still be preferable to delete the original claim? And if so, when? When requests/approvals are matched or the target PVC is created?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that still aligns IMO, once the transfer process is completed then you'd delete the claim from the original namespace.

kind: PersistentVolumeClaim
targetName: bar #this is optional
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that it's necessary, but do we want to consider requiring a secret associated with this? It may be overkill because it's an active operation on both sides and it will only transfer to the destination specified by the source; but just something to think about (briefly). Maybe that's an implementation detail and we address it later.

```

When matching `StorageTransferRequest` and `StorageTransferApproval` resources are detected, the transfer process is initiated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I thought might improve this, rather than immediately taking the action on this approval object, would it make sense to leverage the pvc datasource model that we've been using for other things? So for example what I'm thinking is that to import or accept a transfer the new namespace still issues a new PVC request, but they use a StorageTransferApproval object like you have here as the DataSource.

So the controller would wait for not only the matching requests, but the final even to kick this off would be a create pvc from that transfer datasource. Maybe that's more of an implementation detail.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StorageTransferApproval is not in the same namespace as the target PVC, so the DataSource would have to refer to the StorageTransferRequest.

I agree that this would make the controller simpler. But decided against it because:

  1. its one more thing for the user to do. And what if the target spec doesn't match the source?
  2. the DataSource field is not appropriately ignored/handled for all provisioners at current.

2 is obviously also an issue for the populators, so maybe shouldn't be too concerned about it now? But it could lead to some undesirable behavior in the meantime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I often reverse who does the request and who does the accept... As far as the other points, I see both sides (mine and yours). Would be curious to see if anybody else has thoughts, I'm open to your ideas. Just want to make sure usability is good and if there's places to leverage existing work we do so.


- Make sure the associated PersistentVolume is not deleted/recycled during the transfer by setting reclaim policy to `Retain` (if not already).
- Delete the PVC `source-namespace\foo`.
- Bind The PersistentVolume to `target-namespace\bar`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer not to be implicit, that's one of the reasons I proposed using the datasource model, we can easily enforce things like quota, limits and required claim parameters that way IMO.

Copy link
Author

@mhenriks mhenriks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @j-griffith and @tedyu.

I think the biggest thing to decide is whether the user creates the target PVC is created internally or explicitly by the user (via spec.dataSource).

When a controller detects matching request/approval resources, a transfer in initiated.

When transferring a PersistentVolumeClaim, the associated PersistentVolume will be updated,
the source PersistentVolumeClaim will be deleted, and the target PersistentVolumeClaim will be created.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In another comment here, you propose having the triggering the transfer via the pvc spec.dataSource field. In that case do you think it would still be preferable to delete the original claim? And if so, when? When requests/approvals are matched or the target PVC is created?

```

When matching `StorageTransferRequest` and `StorageTransferApproval` resources are detected, the transfer process is initiated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StorageTransferApproval is not in the same namespace as the target PVC, so the DataSource would have to refer to the StorageTransferRequest.

I agree that this would make the controller simpler. But decided against it because:

  1. its one more thing for the user to do. And what if the target spec doesn't match the source?
  2. the DataSource field is not appropriately ignored/handled for all provisioners at current.

2 is obviously also an issue for the populators, so maybe shouldn't be too concerned about it now? But it could lead to some undesirable behavior in the meantime.


- Make sure the associated PersistentVolume is not deleted/recycled during the transfer by setting reclaim policy to `Retain` (if not already).
- Delete the PVC `source-namespace\foo`.
- Bind The PersistentVolume to `target-namespace\bar`.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, with the datasource model, the user will directly see if creating the target PVC fails because of a quota limitation. Otherwise, it will happen in the background and be reported in the status of the StorageTransferRequest.

What other required parameters are a concern? And does validating the new target PVC require referencing the source PVC to validate compatible parameters? If so, that would be complicated.


// TargetName allows for the source and target resources to have different names
// It is optional
TargetName *string
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to assume that a typical user of this API has access to both namespaces. If so, binding requests/responses is simpler. Dynamic binding can be a later feature.

@mhenriks mhenriks force-pushed the namespace-transfer branch from f9aad52 to d4071c4 Compare March 15, 2020 20:55
…oval.spec.requestName required parameters

Also, StorageTransferRequest.spec.targetName is required

Also removed some details that may be better for design doc.

Signed-off-by: Michael Henriksen <[email protected]>
@mhenriks mhenriks force-pushed the namespace-transfer branch from d4071c4 to ee28d1f Compare March 15, 2020 20:59
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 13, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mhenriks
Copy link
Author

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Copy link
Contributor

@mhenriks: Failed to re-open PR: state cannot be changed. The namespace-transfer branch was force-pushed or recreated.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 21, 2020
@mhenriks
Copy link
Author

/reopen

@k8s-ci-robot
Copy link
Contributor

@mhenriks: Failed to re-open PR: state cannot be changed. The namespace-transfer branch was force-pushed or recreated.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants