Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose KEP to extend allowed DataSource entries to include PVC #642

Merged
merged 2 commits into from
Apr 9, 2019

Conversation

j-griffith
Copy link
Contributor

Propose extending DataSource field in PVCs to allow existing in
namespace PVCs (Clones) and external data population CRDs (Populators).

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 3, 2018
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/pm sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Dec 3, 2018
@thockin
Copy link
Member

thockin commented Dec 4, 2018

@jingxu97 and @saad-ali showed me some of this last week. I think cloning a disk is legitimate and self-evident use-case. Focus on that.

The idea of software initializers is interesting, but can mostly be solved with sidecars, right? Sidecars have the distinct advantage of being active - they can resync if you need, which these initializers can never do. I'm not super inclined to go out of our way to support initializers that could just be sidecars. If they happen to fall out of the clone design, I won't go out of my way to break them, but I'd need to be convinced to lift a finger to support them.

@j-griffith
Copy link
Contributor Author

@thockin thanks for the feedback. Fore sure, the initializer portion could be a side-car; the appealing thing about the DataSource in this case for me was simply a way to formally/consistently express that, and follow the pattern that the Snapshot Controller uses (and well, because it was there).

I like the side car idea, I'll have to research a bit to understand how I would express that though. Regardless, I'll focus on Cloning for now.

@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from fb0ad4f to 7a2a6b8 Compare December 13, 2018 22:38
@j-griffith j-griffith changed the title Propose KEP to extend allowed DataSource entries Propose KEP to extend allowed DataSource entries to include PVC Dec 13, 2018
@j-griffith
Copy link
Contributor Author

@thockin I updated the PR to remove the populator component and just focus on the cloning use case. I'll follow up with a single proposal for the init/populator use cases to keep things focused.

Copy link
Contributor

@mattfarina mattfarina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick process note, this is an API change which means it needs to go through the API review process.

Has this been discussed at SIG Storage, yet?

reviewers:
- TBD
approvers:
- TBD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please have approvers listed? I think this would be the SIG Storage chairs/tech leads but may review this guidance later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added


The external risk to adding PVC as a valid DataSource is minimal, there are also minimal risks or impacts associated with the addition of the new allowed type. The bulk of the needed changes needed are in the Provisioner and of course the implementation via the CSI plugins themeselves.

## Graduation Criteria
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template for graduation criteria notes:

How will we know that this has succeeded? Gathering user feedback is crucial for building high quality experiences and SIGs have the important responsibility of setting milestones for stability and completeness. Hopefully the content previously contained in umbrella issues will be tracked in the Graduation Criteria section.

Aside from the basic requirements which are things being implemented, how can we know this GA quality vs beta quality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional wording, thanks!

@j-griffith
Copy link
Contributor Author

@mattfarina

Quick process note, this is an API change which means it needs to go through the API review process.
Has this been discussed at SIG Storage, yet?

Yes, we've discussed this at length in sig-storage

@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from 7a2a6b8 to 5ca8f61 Compare January 7, 2019 18:30
@mattfarina
Copy link
Contributor

@j-griffith @saad-ali @childsb What was the output of the conversation in SIG Storage?

Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove any references to NEXT_KEP_NUMBER and rename the KEP to just be the draft date and KEP title.
KEP numbers will be obsolete once #703 merges.

@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from 5ca8f61 to aa0e1f8 Compare January 25, 2019 23:57
@k8s-ci-robot k8s-ci-robot added the do-not-merge/blocked-paths Indicates that a PR should not merge because it touches files in blocked paths. label Jan 25, 2019
@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from aa0e1f8 to f7e14e8 Compare January 26, 2019 01:25
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/blocked-paths Indicates that a PR should not merge because it touches files in blocked paths. label Jan 26, 2019
@justaugustus
Copy link
Member

/uncc

@j-griffith
Copy link
Contributor Author

j-griffith commented Feb 8, 2019

@thockin I've reworked this a bit based on discussions in sig-storage yesterday and as per recommendations from @saad-ali

The proposal removes any reference to "populators", and instead just proposes removing API validation checks against the PVC DataSource 1 (which was apparently the original intent of the field).

Currently works with old style provisioners requires 2 to work with CSI provisioners

@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from 8ba2a21 to f3f1944 Compare February 21, 2019 20:27
@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from f3f1944 to 85d3354 Compare February 21, 2019 20:49
@saad-ali
Copy link
Member

We should finalize on the end to end plan for how generic populators will work before relaxing the validation. Specifically we should finalize on how PVCs will be safely mounted to populators (but no other workloads) until the populator is done.

@j-griffith
Copy link
Contributor Author

@saad-ali

Specifically we should finalize on how PVCs will be safely mounted to populators (but no other workloads) until the populator is done.

WRT to this KEP I disagree, I specifically removed any mention of Populators, I just want to enable PVC's for cloning at this point.

Additionally; why wouldn't we use the same taint mechanism we've agreed upon for snapshots? I've somewhat given up on Populators (at least for the short term) but I still think that we should get Clones working. After all it's in the CSI Spec, It doesn't introduce any risk that I can think of, and it's a pretty common back end storage feature.

I'd just file a bug and add only PVC, but I can't do that because of the API validation checks without a KEP.

@rollandf
Copy link

LGTM

Propose extending DataSource field in PVCs to allow existing in
namespace PVCs (Clones).
@j-griffith j-griffith force-pushed the extend_volume_source_kep branch from 85d3354 to a36f891 Compare February 25, 2019 20:18
@j-griffith
Copy link
Contributor Author

@saad-ali @thockin Should be explicitly Clone only now and put the other things in the "non goals" category explicitly, thanks!

keps/sig-storage/20181111-extend-datasource-field.md Outdated Show resolved Hide resolved

## Motivation

Features like Cloning are common in most storage devices, not only is the capability available on most devices, it's also frequently used in various use cases whether it be for duplicating data or to use as a disaster recovery method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloning is a common feature on most storage devices. It is used for duplicating data and, in some cases, as a disaster recovery method.

#### Story 4
As a cluster admin or user, I want to be able to provide the equivalent of data templates to users in the Cluster to ensure consistent and secure data sets

### Implementation Details/Notes/Constraints [optional]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's specify that this functionality will only be added for CSI. In-tree volume plugins and Flex drivers will not get this functionality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be added in non-goals or constraints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and added it non-goals, and also put a note in the Implementation Details/Notes section to just stress the point. Thanks


### Risks and Mitigations

The primary risk of this feature is a Plugin that doesn't handle Cloning in a safe way for running applications. It's assumed that the responsibility for reporting Clone Capabilities in this case is up to the Plugin, and if a Plugin is reporting Clone support that implies that they can in fact Clone Volumes without disrupting or corrupting users that may be actively using the specified source volume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: plugin -> CSI Driver

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, added csi driver and csi plugin so there shouldn't be any question going forward

* API changes allowing specification of a PVC as a valid DataSource in a new PVC spec
* Implementation of the PVC DataSource in the CSI external provisioner

The API can be immediately promoted to Beta, as it's just an extension of the existing DataSource field. There are no implementations or changes needed in Kubernetes other than accepting Object Types in the DataSource field. This should be promoted to GA after a release assuming no major issues or changes were needed/required during the Beta stage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the PVC Datasource field is gated by the VolumeSnapshotDataSource alpha feature gate.
Going forward it makes sense to add additional feature gates for each type of datasource, e.g. cloning.
But I'm not sure that cloning should get to skip alpha?

Copy link
Contributor Author

@j-griffith j-griffith Mar 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can change that. It just seemed that since all we were doing was adding a feature gate for PVC to be specified in DataSource that it shouldn't take a full alpha and beta to be comfortable with it. I'll remove the "skip to beta" verbage and we can go from there. If we feel more comfortable with an alpha then beta release that's great.


## Alternatives [optional]

Snapshots and Clones are very closely related, in fact some back ends my implement cloning via snapshots (take a snapshot, create a volume from that snapshot). Users can do this currently with Kubernetes, and it's good, however some back ends have specific clone functionality that is much more efficient, and even for those that don't, this proposal provides a simple one-step process for a user to request a Clone of a volume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we recommend against storage systems implementing it this way? I.e. do we want to provide storage systems constraints on how and when to implement clone?

Copy link
Contributor Author

@j-griffith j-griffith Mar 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure honeslty; I used this as an alternative example because it came up :) The only reason I could see any storage provider using this as a clone implementation is because they don't offer cloning on their backend; in which case it's not ideal, but "ok"; they still eliminate the multi step process for the end user.

I think honestly I should've reworded this completely and left out the implementation part; and just pointed out that it's a possible alternative to get the clone behavior without this KEP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the clone api should be a common/single interface exposed to the user and storage provider/driver can do it by their choice in the backend. ie from snapshot or directly by their clone api.

@j-griffith
Copy link
Contributor Author

@saad-ali Now that we're heading into a new release and I've removed all wording regarding populators and just discuss Clone only; any chance we can get this approved/merged so I can get the implementations done here and in the provisioner hopefully for this next release?

@j-griffith
Copy link
Contributor Author

@justaugustus I squashed your request for changes so can't really fix things up; any chance I could get you to take a look and either update or remove your review change request?

@justaugustus justaugustus dismissed their stale review March 24, 2019 12:01

Dismissing as KEP number is removed.

@justaugustus
Copy link
Member

/uncc

@j-griffith
Copy link
Contributor Author

@saad-ali @thockin Any chance I could get either of you to comment on the updates and get this moving along again?

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

We can implement alpha of this in Q2 for Kubernetes 1.15.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: j-griffith, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2019
@k8s-ci-robot k8s-ci-robot merged commit 8a08ffe into kubernetes:master Apr 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants