Volume Snapshot in Kubernetes - Snapshot API Object discussion #44172
cc @kubernetes/sig-storage-api-reviews @kubernetes/sig-storage-feature-requests @kubernetes/sig-storage-misc |
cc @tsmetana |
Just to be clear, this is a two object split.. correct? There's the "SnapShot" object which has no namespace.. then a "SnapShotRequest" which is a namespaced object.. As we add more imperative actions to PVs, should we have a 'generic' request object that would allow Snapshot, Replication, Resize, Restore, etc... |
And would |
@childsb I don't think having a generic request object is something we want to introduce now (and I suspect we won't need it because nearly everything will be specific to each case). @rootfs I think I even made a mistake by adding the
So owners will be the |
And maybe rename SnapshotRequest to SnapshotSet ? |
@MikaelCluseau a generic PVR (persistent volume request)... could be similar to PV where there's a set of common fields shared and specific fields that are specific to the actionable command. We could still keep the strong typing of fields for each request type. Examples:

apiVersion: v1
kind: PersistentVolumeRequest
metadata:
  name: snapshot1
spec:
  snapshot:
    pvc: pvc1

apiVersion: v1
kind: PersistentVolumeRequest
metadata:
  name: action-create
spec:
  createVolumeFromSnapshot:
    snapshot: snapshot1

apiVersion: v1
kind: PersistentVolumeRequest
metadata:
  name: action5
spec:
  resizePVC:
    pvc: pvc1
    newSize: 25Gi

apiVersion: v1
kind: PersistentVolumeRequest
metadata:
  name: action6
spec:
  restore:
    snapshot: snapshot1
|
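To make the "common fields plus strongly typed per-action fields" idea concrete, here is a minimal Python sketch, not Kubernetes code. It models the hypothetical PersistentVolumeRequest spec as a "exactly one of" structure. The class names, the snake_case field names and the `action()` helper are assumptions made for illustration; only the action names come from the examples above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SnapshotAction:
    pvc: str


@dataclass
class CreateVolumeFromSnapshotAction:
    snapshot: str


@dataclass
class ResizePVCAction:
    pvc: str
    new_size: str


@dataclass
class RestoreAction:
    snapshot: str


@dataclass
class PersistentVolumeRequestSpec:
    # Hypothetical "one of": each action keeps its own typed fields.
    snapshot: Optional[SnapshotAction] = None
    create_volume_from_snapshot: Optional[CreateVolumeFromSnapshotAction] = None
    resize_pvc: Optional[ResizePVCAction] = None
    restore: Optional[RestoreAction] = None

    def action(self) -> str:
        """Return the name of the single action that is set, or raise ValueError."""
        chosen = [f.name for f in fields(self) if getattr(self, f.name) is not None]
        if len(chosen) != 1:
            raise ValueError(f"exactly one action must be set, got {chosen}")
        return chosen[0]
```

A controller for such an object would have to dispatch on `action()` and implement every branch, which is one way to see how the amount of controller logic grows with each action added.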
@childsb your proposal of a generic request is very interesting. But I have a concern about the complexity of the controller. With a generic request, many different operations can come from the request, so we need to think through how such a controller would work. |
@childsb to me it's not similar; a PV is an actual thing in the system (a mount point) that the user wants to manipulate. It just happens that this thing has a relatively complex setup because of the diversity of source and the lack of standardization in the storage industry. I think we can all agree that the PV object would be much better if we could simply have expressed the source as a path in the storage system instead of a different thing for each storage tech out there. Here, the actual thing we manipulate is a snapshot (hence the mistake in the name SnapshotRequest, I should definitely have said SnapshotSet). In terms of consistent UX, we have StatefulSets and Deployments, not a generic object even though they share a common generic logic that's materialized as a ReplicaSet. So, IMHO, if we need something generic at some point, I really think it shouldn't be a first class object. |
@childsb, this is not how Kubernetes works. It's intent based. https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#types-kinds
PersistentVolumeRequest is not a persistent entity in the system, it's a command. To create a snapshot, the user should create an object representing the snapshot and Kubernetes will create it. To resize a volume, the user modifies the volume and Kubernetes will try to resize it. This is complicated by security aspects: we don't show PVs to users, which is why there is a claim. Users should resize claims and Kubernetes would try to make it happen in the real objects outside of Kubernetes. Similarly with snapshots. |
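The intent-based model described in this comment can be illustrated with a toy reconcile function. This is only a sketch in Python under simplifying assumptions: the declared state and the backend state are both plain maps from snapshot name to source claim name, and `reconcile` is a hypothetical helper, not an existing Kubernetes function.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, str], actual: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare declared snapshot objects with what exists in the storage backend.

    Both maps go from snapshot name to source claim name. Returns the
    (verb, snapshot name) actions needed for the backend to converge.
    """
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

The user never sends "create" or "delete" commands here; they only add or remove objects in `desired`, and the actions fall out of the comparison.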
@jingxu97, IMO there needs to be an object that represents single snapshot and is in user namespace so the user can see what's its status and we can have quota around it. I don't have a good name for it, I'm going to use SnapshotClaim in this comment, but I know it's ugly (user is not claiming a snapshot, he's creating it...) Any object that creates multiple snapshots, say SnapshotSet, should be IMO a separate If there should be a Snapshot object in the global namespace is another discussion, I have nothing against it, however these should not be visible to users, similarly as PVs are not visible to users because of security. All regular operations with snapshots must be done by a user creating/modifying/deleting objects in user's namespace. |
It's not that ugly; with dynamic provisioners, PVClaims are actually creating PVs. |
Important is the intent - with PVC, user wants to claim an existing PV. A new one is created only when there is nothing to claim. With SnapshotClaim, user probably does not want to claim an existing snapshot, he really wants to create a new one. |
I actually like the idea of SnapshotRequest to create a Snapshot (object). The one thing I dislike in this proposal is the snapshotting of multiple volumes at once: the result would be multiple objects outside the cluster anyway so having them represented with separated object looks much more intuitive to me. It is also more flexible. I'd rather use some simple mechanism for marking snapshots belonging to one group. |
The important part of "at once" is "consistent" ;-) |
We discussed this use case and we're trying to avoid it in the first version as it needs support in the application. How else do you ensure consistent snapshots of 10 different EBS volumes running say Cassandra? |
I have no problem with pushing the "multiple snapshots at once" outside the scope of a first version. |
fsfreeze is supported only on ext3/4, xfs, reiser and jfs... The proposal should be filesystem-agnostic, and as such I don't think we should take responsibility for data consistency at all. |
In addition, fsfreeze ensures just filesystem consistency, not consistency of the database stored on the filesystem. It may be corrupted. Anyway, it's fine to have a high-level object like SnapshotSet and we can argue how to make it consistent later, but we need also an object that represents single snapshot in user namespace. The user needs to see what snapshots are available, their states and he/she needs a way how to restore / delete individual snapshots and not whole set. Something like StatefulSet and Pods - both are in user namespace and user can monitor them individually. |
Let's forget SnapshotSet; this applies to a single snapshot too. We won't fix users wishing filesystems without fsfreeze or equivalent support, but not using fsfreeze when possible (99% of the cases?) will cause much more damage. We won't fix in-app cache flushing generically either. Obviously, we'll have to document clearly what we do, and also do the best we can. |
Thank you everyone for the insightful feedback and discussion. I think it
would be good to discuss this in F2F meeting. This snapshot API design is
critical to our snapshot feature at the very first stage. That is why I
want to initiate this discussion first here and gather feedback so that we
could make a conclusion faster hopefully at the meeting. I will try to
summarize the discussion here and update the proposal. Please continue to
send your thoughts and comments about the design. Thank you!
--
- Jing
|
Can you discuss why we need more than a single object? Argument 1 (that
admins want to do their own backups) isn't enough of a reason to have two
objects.
…On Tue, Apr 25, 2017 at 3:25 AM, Jing Xu ***@***.***> wrote:
Several questions related to the snapshot API design have been discussed,
including whether the API object should be in the user namespace and
whether there should be two objects such as Request and Snapshot (also
SnapshotSet). The following is a summary of the points.
*Namespace discussion*
1. Unlike PVs, which are created by admins only, snapshots could be
initiated by users or admins. A snapshot object shows the detailed
information and status of the snapshot. If a user creates the
snapshots, the objects should be in the user's namespace so that the user
can query/list them. Admins might also create snapshots of the PVs available
in the system for data backup purposes. Those snapshots do not belong to a
specific user namespace and might be used by any user.
2. How snapshots are used is quite unique. User might want to create
or restore volumes from the snapshots they have taken. It is also possible
that they want to use the snapshots taken by the admins. Also considering
data replication and some other scenarios, snapshots taken from a namespace
might be needed in another namespace.
[image: volume snapshot in kubernetes]
<https://cloud.githubusercontent.com/assets/19172805/25373099/198846f4-294d-11e7-9185-577d102a4a72.jpg>
Now we propose a two-object API design. Similar to PVC/PV, snapshot has
two related objects, VolumeSnapshotRequest and VolumeSnapshot.
1. VolumeSnapshotRequest (VSR) is a request for a snapshot by a user.
Similar to a PVC, a VSR is a user-namespaced object, which provides a way to
request or consume snapshot resources.
2. VolumeSnapshot (VS) is a representation of the snapshot taken from
a volume. This API object captures the details of the implementation of the
snapshot. Similar to PV, VS is a non-namespaced object. There are two ways
to create VS: statically or dynamically.
*Static:*
If there are existing snapshots created outside of kubernetes, an
administrator can create VSes to represent details of the snapshots which
are available for use by cluster users. Users create VSRs in their
namespaces and point them to the corresponding VSes. VSR becomes a handler
to access those snapshots by a user.
*Dynamic*
Users could also use a VSR to create snapshots from a volume on demand.
The VSR specifies which volume the snapshot should be taken from. In that case,
a corresponding VS is created. The VSR and VS will have pointers to each other.
*Many-to-one mapping between VSR and VS*
There could be use cases that snapshots are needed across different
namespaces, e.g. for data replication. To achieve this, here we propose to
support many-to-one mapping relationship between VSR and VS. There is a
field “allowedNamespaces” in VS/VSR object to specify in which namespaces
the VSRs could be created for the snapshot. For static creation, system
admins can add the list of namespaces. In dynamic creation, user specifies
the namespace list which will be propagated to the VS.
If a user wants to access a VS from a different namespace, he/she could
create a VSR to reference the VS from that namespace as long as it is in the
“allowedNamespaces” list. In this way, VSR to VS is a many-to-one mapping.
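The “allowedNamespaces” check described above could look roughly like the following Python sketch. The `VolumeSnapshot` class, its `allowed_namespaces` field name and the `may_reference` helper are illustrative assumptions, not part of any existing API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VolumeSnapshot:
    # Non-namespaced object; field names are illustrative only.
    name: str
    allowed_namespaces: List[str] = field(default_factory=list)


def may_reference(vs: VolumeSnapshot, vsr_namespace: str) -> bool:
    """A VSR may point at the VS only from a namespace on the allow list."""
    return vsr_namespace in vs.allowed_namespaces
```

For example, a VS created with `allowed_namespaces=["team-a", "team-b"]` could be referenced by VSRs in either of those namespaces, while a VSR in any other namespace would be rejected.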
*Differences between PVC/PV and VSR/VS*
Binding between PVC and PV requires an atomic transaction to avoid a race
condition in which multiple PVCs bind to the same PV. PVC to PV is a
one-to-one mapping, so we have to make sure only one PVC can bind to the PV.
Snapshots do not have this requirement, so the binding between VSR
and VS does not need to be atomic.
Since VSR to VS is a many-to-one mapping, deleting a VSR does not necessarily
trigger deletion of the VS unless this VSR is the last reference to it.
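The "last reference" deletion rule can be sketched as simple reference bookkeeping. This is a hedged Python illustration: the `References` map and the `delete_vsr` function are hypothetical, and a real controller would also have to handle concurrent updates, which this sketch ignores.

```python
from typing import Dict, Set, Tuple

# VS name -> set of (namespace, VSR name) references. Hypothetical bookkeeping.
References = Dict[str, Set[Tuple[str, str]]]


def delete_vsr(refs: References, vs_name: str, namespace: str, vsr_name: str) -> bool:
    """Drop one VSR reference; return True if the VS should now be deleted."""
    holders = refs.get(vs_name, set())
    holders.discard((namespace, vsr_name))
    if holders:
        # Other VSRs still point at the VS, so it must survive.
        return False
    refs.pop(vs_name, None)
    return True
```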
|
For instance - every user requesting a snapshot is requesting that
independently. Unless the user sets preconditions, the system can very
well coalesce multiple snapshot requests at the same time. The snapshot
should be suitable for creating new PVCs from via dynamic provisioning.
But I still don't see the requirement for two objects.
|
I think snapshots for data replication will be very useful for StatefulSets and that means that data stays in the same namespace. I am not sure of the security implications of cross-namespace snapshots and the proposed mechanism needs to be vetted by @kubernetes/sig-auth-api-reviews folks. I share Clayton's thoughts, why can't the flow simply include a single object and as a user I create such an object (volumeSnapshotRequest or snapshotRequest) and get back a PVC bound to the data I asked a snapshot for? Admin backups can still be PVs, no? |
Hi Michail,
Could you please elaborate on "get back a PVC bound to the data I asked
a snapshot for"? When a user requests a snapshot (I assume the
snapshot request specifies a PVC to indicate which volume the snapshot
should be taken from), does it mean it also creates a volume and a PVC bound
to this volume, or a PVC bound to this snapshot?
--
- Jing
|
The outcome of the snapshotRequest is a new PV ("the data") bound to a new PVC. The new PVC is what I get back in the namespace where I did the snapshotting. |
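SnapShots are NOT volumes. When dealing with a storage array one can create a snapshot of a volume. In many storage arrays a 'snapshot' could in fact just be a diff from some previous state. One can later create a volume from that snapshot. We plan to represent that similarly in kube. Let me walk through a flow at a high level:
1. User creates a PVC which creates a PV
2. User uses this PVC like always, nothing interesting yet.
3. User creates a SnapShotRequest which points to a PVC
4. System triggers the snapshot creation action on the storage array. System also creates a SnapShot API object with data from the storage array which is bound to the SnapShotRequest.
5. At some later point a user might create a new PVC which points to the SnapShotRequest
6. System triggers the snapshot->volume action on the storage array. System also creates a new PV API object from the storage array result bound to the new PVC
7. User uses this new PVC like always, nothing interesting anymore.
I had a discussion with clayton and tend to agree. It is bikeshedding but important bikeshedding. Let's change the object names. Users should create something called a Snapshot. The system should create an object with a less human friendly name, maybe SnapshotRepresentation or something. We don't call it a PodRequest. We don't call it an IngressRequest. The user visible name should be Snapshot. I think we messed that up for PVs and think we should get it right here. I don't really care what we call the other half. But I do feel pretty strongly about the user facing name. (In kube -v2 I think we should make it PV and VolumeRepresentation or something like that) |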
@thockin and I had previously agreed that the "PersistentVolumeClaim" and
"PersistentVolume" naming was incorrect - what an end user saw should be
"PersistentVolume" (because that's how the user thinks of it), and the
lower level object (what an admin sees) can be more technical, like
"PersistentVolumeDefinition" or similar.
So I'm ok with the two object model - but as Eric describes I really want
to ensure we focus on the human focused name first (for the thing everyday
users deal with), and the admin focused object can be more precise and
technical and less human-friendly. If a user wants to take a snapshot,
they create a Snapshot.
…On Thu, Apr 27, 2017 at 2:27 PM, Eric Paris ***@***.***> wrote:
SnapShots are NOT volumes. When dealing with a storage array one can
create a snapshot of a volume. In many storage arrays a 'snapshot' could in
fact just be a diff from some previous state. One can later create a volume
from that snapshot. We plan to represent that similarly in kube.
Let me walk through a flow at a high level:
1. User creates a PVC which creates a PV
2. User uses this PVC like always, nothing interesting yet.
3. User creates a SnapShotRequest which points to a PVC
4. System triggers the snapshot creation action on the storage array.
System also creates a SnapShot API object with data from the storage array
which is bound to the SnapShotRequest.
5. At some later point a user might create a new PVC which points to
the SnapShotRequest
6. System triggers the snapshot->volume action on the storage array.
System also creates a new PV API object from the storage array result bound
to the new PVC
7. User uses this new PVC like always, nothing interesting anymore.
------------------------------
I had a discussion with clayton and tend to agree. It is bikeshedding but
important bikeshedding. Lets change the object names. Users should create
something called a Snapshot. The system should create an object with a
less human friendly name, maybe SnapshotRepresentation or something. We
don't call it a PodRequest. We don't call it an IngressRequest. The user
visible name should be Snapshot. I think we messed that up for PVs and
think we should get it right here. I don't really care what we call the
other half. But I do feel pretty strongly about the user facing name.
(In kube -v2 I think we should make it PV and VolumeRepresentation or
something like that)
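As a rough illustration of the seven-step flow quoted above, the following Python sketch simulates the objects and their bindings in memory. It is a toy model under stated assumptions: the `Cluster` class, its method names and the generated object names are all hypothetical, and no storage array is involved.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Cluster:
    # Each map is name -> name of the object it is bound to or created from.
    pvcs: Dict[str, str] = field(default_factory=dict)       # PVC -> PV
    requests: Dict[str, str] = field(default_factory=dict)   # SnapShotRequest -> SnapShot
    snapshots: Dict[str, str] = field(default_factory=dict)  # SnapShot -> source PV
    pvs: Dict[str, str] = field(default_factory=dict)        # PV -> origin

    def create_pvc(self, name: str, from_request: str = "") -> str:
        """Steps 1 and 5-6: provision a PV, optionally from a snapshot."""
        pv = f"pv-for-{name}"
        if from_request:
            snapshot = self.requests[from_request]  # KeyError if the request is unknown
            self.pvs[pv] = f"snapshot:{snapshot}"
        else:
            self.pvs[pv] = "empty"
        self.pvcs[name] = pv
        return pv

    def create_snapshot_request(self, name: str, pvc: str) -> str:
        """Steps 3-4: snapshot the PV behind a PVC and bind the result."""
        snapshot = f"snapshot-for-{name}"
        self.snapshots[snapshot] = self.pvcs[pvc]
        self.requests[name] = snapshot
        return snapshot
```

Note that the user only ever names PVCs and SnapShotRequests; the PV and SnapShot objects are created by the system, which matches the split between user-facing and system-facing objects discussed here.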
|
@eparis Re 5. At some later point a user might create a new PVC which points to the SnapShotRequest: |
The user is making a request to link a high level object under his
control to another high level object. The lower level details don't
really matter to her (ie it doesn't matter there is a PV or some tar
file on S3). The user should be able to request new PVCs based on
existing snapshots that have been successfully captured (or request a
new pvc pointing to a new snapshot at the same time and expect Kube to
fulfill that).
|
Thanks for the discussion. We're moving forward. We will use two objects and it also looks like we have agreed to rename |
I have been thinking more about this two-object proposal and the PV/PVC relationship, and I feel that the two-object model is necessary for the pre-existing volume case, but not for the dynamic provisioning case.
Going back to snapshots: similarly, depending on the use case, two objects might or might not be needed.
I think we need to be clear about which use cases we plan to support, and whether those use cases are reasonable, before making the decision. If case 2 is not something we want or can reasonably support, I feel two objects are not really necessary. Although we say it is similar to the PV/PVC case, a snapshot is closer to the dynamic provisioning case of PVC/PV, which does not really require two objects by itself. |
I don't have that strong preference for the two-object model as it may seem. I originally proposed one object model too. The two object model emerged from the upstream discussion for the reasons I have mentioned earlier: mostly it looks to be less limiting for future development.
This I believe is the important point: The two-object model is unnecessary in most cases. However the data replication use case comes popping up in every discussion. The two-object model actually tells users we're not writing the use case off and we want to try to resolve it. I'm also not sure how feasible or reasonable it is... |
I question whether scheduled create or delete should be in the API. Once you enable snapshots to be created by API, you can implement scheduled snapshots, including periodically scheduled snapshots. So I don't think this level of detail needs to be, or should be, called out unless you propose that the scheduled snapshot mechanism is part of Kubernetes itself, which I do not think is a requirement now, and perhaps ever. Warning: if you declare yourself to be the “one true standard” for implementing this, be advised that with complexities such as timezones, for one, this is a huge undertaking.
The API also needs to expose snapshot delete in order to make this useful for common applications. And obviously an API to ascertain the count and identity of snapshots is also required.
In order to support applications engaging in recurring snapshot create/delete, it is desirable to allow such an application to apply a label or other unique searchable/queryable identifier to allow determination of whether your application created a snapshot. At a practical level you don't want a backup service deleting an admin-created snapshot simply because it can't discriminate based on who created the snapshot.
In order to support application-initiated (and presumably application-consistent) snapshots, it would be desirable to have a mechanism to enable/disable this (by either pod or container), with an API authentication mechanism that simply identifies the request as coming from within the pod/container without requiring further credentials - the ask here is to enable making snapshot create
I am worried that support for a list of volumes in the snapshot create API will be erroneously interpreted as an expectation that these snapshots would be captured at a “singular and common” point in time. Few (and perhaps zero) underlying storage implementations are capable of delivering this in a Kubernetes context.
I question whether bundling “create new volume from snapshot” into the SnapshotRequest is desirable. Common use cases exist for creating a volume from a snapshot long after the original snapshot was created. If as a developer you decide you want to do this, I think finding the functionality in something labeled SnapshotRequest is unexpected. |
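The labeling point raised in this comment (a backup service must only delete snapshots it created itself) can be sketched as a label-selector filter combined with a retention count. This is an illustrative Python sketch; the `Snapshot` class, the `created-by` label key used in the usage note and the `prune_candidates` helper are assumptions, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Snapshot:
    name: str
    created_at: int  # seconds since epoch, kept as int for simplicity
    labels: Dict[str, str] = field(default_factory=dict)


def prune_candidates(snapshots: List[Snapshot], selector: Dict[str, str], keep: int) -> List[str]:
    """Names of snapshots matching every selector label, minus the newest `keep`."""
    if not selector:
        # An empty selector would match admin-created snapshots too.
        raise ValueError("selector must not be empty")
    owned = [s for s in snapshots if all(s.labels.get(k) == v for k, v in selector.items())]
    owned.sort(key=lambda s: s.created_at, reverse=True)
    return [s.name for s in owned[keep:]]
```

With a selector such as `{"created-by": "backup-service"}`, an unlabeled admin snapshot is never returned as a deletion candidate, however old it is.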
@cantbewong wrote: Common use cases exist for creating a volume from a snapshot long after the original snapshot was created. If as a developer you decide you want to do this, I think finding the functionality in something labeled SnapshotRequest is unexpected |
@jingxu97 even in case 1, how do you hide the storage access keys from the user without 2 objects? ie, accessing a Ceph snapshot will require a monitor list, a key, a pool, etc. I don't want my users to have access to that. |
Could subresource be used here to hide information from users? As Jan
mentioned before
"I checked Kubernetes and it is possible to have a namespaced Snapshot
object with a subresource (SnapshotStorage?). Access to the subresource
is subject of RBAC, so we can restrict users in creating, editing or
even getting the subresource and let only a controller or admin access it.
I checked Kubernetes sources and subresources are widely used, e.g. for
Status. However, both e.g. namespaces/myns and namespaces/myns/status
lead to the *same* object in etcd, it's a full Namespace object with
both Namespace.Spec and Namespace.Status. To be honest, I don't
understand this approach.
So, it is possible to have something like
api/v1/namespaces/myns/snapshots/mysnapshot with Snapshot object,
accessible by user and
api/v1/namespaces/myns/snapshots/mysnapshot/storage with SnapshotStorage
object with coordinates of the snapshot in AWS/GCE; accessible by
admin/controllers. We would be probably the first to do that and I'd
like to ask @bgrant, @etune or @lavalamp if it's the right approach."
--
- Jing
|
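You have the whole subject really well in mind :-) So this is a technical solution implying RBAC on subresources. It's 1 object in etcd, but 2 in the API anyway. Pure opinion here: I find the subresource approach much more complex to understand for the admin/user; I mean, why should I do kubectl get snapshotstorage --all-namespaces (and deduplicate the output if a snapshot is shared between namespaces) to get a list of actual "real world" objects that are not namespaced by nature? I completely fail to see any gain and still see a potential danger in forcing such an artificial property. In particular, we're kind of locking the snapshots to their namespace while they are not in reality. |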
I think so far the workflow we most want to support is that users
create snapshots and use them. In this way, a snapshot represents the user's
data, which should be in the user's namespace.
About Ceph: from the documentation, I think only Ceph block devices support
snapshots, and Kubernetes currently does not support block devices. For the
snapshot command, it mentions that "you must specify a user name or ID and a
path to the keyring containing the corresponding key for the user". Isn't that
something the user actually has to provide to take a snapshot?
I might be wrong, but I guess you are focusing more on the case of a system
admin taking snapshots. I wonder: if a system admin creates a Ceph snapshot,
is it possible for a user to access it?
…On Tue, May 2, 2017 at 11:37 PM, Mikaël Cluseau ***@***.***> wrote:
You have the whole subject really well in mind :-) So this is a technical
solution implying RBAC on subresources. It's 1 object in etcd, but 2 ones
in the API anyway. Pure opinion here: I find the subresource approach much
more complex to understand for the admin/user; I mean, why should I do kubectl
get snapshotstorage --all-namespaces (and deduplicate the output if a
snapshot is shared between namespaces) to get a list of actual "real world"
objects that are not namespaced by nature? I completely fail to see any
gain and still see a potential danger in forcing such an artificial
property. In particular, we're kind of locking the snapshots to their
namespace while they are not in reality.
--
- Jing
|
Kubernetes does support Ceph RBD (Rados Block Device), it's what I use :) https://github.com/kubernetes/kubernetes/tree/master/examples/volumes/rbd. The ceph workflow is around this:
The key/user are usually those given to the "volume manager" (ie cinder in openstack). So, if "the user" is Kubernetes, then yes, the user has to provide its key to make operations on the images. But if "the user" is the Kubernetes user, then she doesn't need to provide this information (there's no link between the Ceph user and the Kubernetes one). So, Kubernetes will have access to the snapshots it takes, whatever namespace or other Kubernetes concept is involved. In the PV/PVC split, the key, mons and pool args are in the PV, provided for instance by the provisioner. |
@MikaelCluseau good point. So in rbd PV/PVC, there are two keys: admin and user. The admin key is used to create and delete volumes while the user key is used to map/unmap the rbd image. We can use the same idea here: an admin key in its own namespace, and let the controller use it to snap/clone. |
/cc @skriss |
I had a long chat with @jingxu97 about this today. She asked me to summarize here. I haven't read this whole issue, so apologies in advance if I misunderstand or state what has already been said.
In Kubernetes RBAC, it is desirable, and I think common, to give a user within a namespace permission to access all resources of all objects within that namespace, including subresources. (Details: this happens by giving that user the
So, there is no way to prevent this namespace-admin user from having access to all subresources of a namespaced
Therefore, there is no way to prevent someone in namespace
(You might ask why not change the role definition. The problem is that then the role definition would either have to have an explicit deny in it (which RBAC intentionally does not support because it has other problems) or the role would have to enumerate every kind and subresource except the one sensitive one, which is not possible, since the set of kinds is unbounded.)
The best approach, from our conversation, seemed to be either the two Kinds approach or storing some private-to-the-controller information in an external data store (effectively two objects). |
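The concern summarized in this comment can be illustrated with a deliberately simplified model of how an RBAC rule's resource list is matched. This Python sketch is not the real Kubernetes authorizer: `rule_allows` is a hypothetical helper, and `snapshots` and `storage` are the hypothetical resource and subresource names used earlier in this thread. It only models a bare "*" wildcard and exact "resource/subresource" entries.

```python
from typing import List


def rule_allows(rule_resources: List[str], resource: str, subresource: str = "") -> bool:
    """Simplified model of matching an RBAC rule's resource list.

    "*" matches every resource and subresource; otherwise the rule must
    name "resource" or "resource/subresource" exactly.
    """
    requested = f"{resource}/{subresource}" if subresource else resource
    return "*" in rule_resources or requested in rule_resources
```

Under this model a role with `["*"]` reaches `snapshots/storage`, while a role that lists only `["snapshots"]` does not. Because rules are allow-only, excluding one subresource means enumerating everything else, which is the difficulty described above.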
Is it an option, at least initially if not permanently, to disallow all editing of a snapshot? In a way, this is consistent with the basic concept of snap-shotting... |
Thanks for the updates. I think the two objects would be the safest path forward then: it looks to me it might be really more extensible in the future... |
On further thought, my comment above may not be accurate. I've asked the RBAC experts in sig-auth for clarification. |
I spoke with Sig Auth. The leads have a different view of how the namespace admin role is going to be extended than I had. Their view is consistent with the subresources approach. So, I retract my previous objection. Apologies if this slowed things down. |
Inspired by @MikaelCluseau's suggestion, the following is a list of discussion points related to the Snapshot API object design.
Use new API objects for snapshot or not
I think having new API objects to represent snapshots will make it much easier to support different operations on snapshots. The possible operations include creating a snapshot from a volume, creating a new volume or restoring a volume from a snapshot, listing and deleting snapshots, and showing the status (available, pending, etc.) and detailed information (creation time, size) of a snapshot. So it would give us more flexibility compared to just tagging the snapshot information onto a PV.
How snapshot is created
Although it is possible to use just a PVC to initiate snapshot creation, I think using a new API object (SnapshotRequest) to represent snapshot creation gives much more flexibility and supports more options.
a). It could support both one-time snapshot creation and scheduled snapshots, with many more snapshot policy options. For example: how many snapshots may be taken periodically for a volume, how long a snapshot is kept before it is deleted to save space, how much storage space may be used by snapshots, etc.
b). It could support creating snapshots for a list of volumes. System admins might want to create snapshots for all volumes in the system overnight for backup purposes. Instead of specifying only one snapshot per PVC, they could use a wildcard to list all volumes in the snapshot request.
c). It could support creating snapshots for pods, statefulsets, deployments, etc. Users might not have direct information about the PVs for their pods. In a snapshot request, the sources to create snapshots from could be PVs, or pods and statefulsets; Kubernetes can retrieve the volume information from the pod objects and take the snapshots for the user.
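To make options a) through c) more concrete, here is a rough sketch of what such a request might look like. Nothing in it is an agreed-upon API: the `apiVersion` group and every field under `spec` (`sources`, `schedule`, `policy` and its children) are hypothetical names picked only for illustration.

```yaml
# Hypothetical sketch only: none of these fields exist in a Kubernetes API.
apiVersion: snapshot.example.io/v1alpha1   # placeholder group/version
kind: SnapshotRequest
metadata:
  name: nightly-backup
  namespace: team-a
spec:
  # b) and c): a request can name several sources; for workloads, the
  # controller would look up the volumes from the pod objects.
  sources:
    - kind: PersistentVolumeClaim
      name: pvc1
    - kind: StatefulSet
      name: db
  # a): a scheduled snapshot; leaving `schedule` out would mean one-time.
  schedule: "0 2 * * *"      # cron-style: every night at 02:00
  policy:
    maxSnapshots: 7          # how many periodic snapshots to keep per volume
    expireAfter: 168h        # delete a snapshot after this amount of time
    maxTotalSize: 100Gi      # cap on storage space used by snapshots
```

A wildcard source, as in b), could be one more entry in `sources`, but its exact syntax is left open here.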
How to create a new volume and restore a volume from a snapshot.
Currently Kubernetes supports dynamic provisioning to create a new volume for a PVC. Creating a new volume from a snapshot is similar to this process. I am thinking of again using SnapshotRequest to perform this task, by giving the snapshot identifier and a keyword indicating whether to create a new volume or restore the existing volume. If a new volume is created, a new PV object will be created to represent this volume. In the case of a restore, the existing PV still represents the reverted volume. I think this way is cleaner than mixing this function into the PVC, which would require modifying the PVC object.
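As an illustration of the "snapshot identifier plus keyword" idea, the two cases might be written as follows. This is only a sketch: the `fromSnapshot` and `action` fields, their values, and the `apiVersion` are assumptions made up for this example, not a proposed final shape.

```yaml
# Hypothetical sketch only: field names and values are illustrative.
# Case 1: create a new volume from a snapshot (a new PV would be created).
apiVersion: snapshot.example.io/v1alpha1   # placeholder group/version
kind: SnapshotRequest
metadata:
  name: clone-from-snapshot1
  namespace: team-a
spec:
  fromSnapshot: snapshot1    # the snapshot identifier
  action: CreateVolume       # keyword: provision a new volume
---
# Case 2: restore the existing volume (the existing PV keeps representing it).
apiVersion: snapshot.example.io/v1alpha1
kind: SnapshotRequest
metadata:
  name: revert-to-snapshot1
  namespace: team-a
spec:
  fromSnapshot: snapshot1
  action: Restore            # keyword: revert the existing volume in place
```

Either way the PVC object itself is left unmodified.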
Namespace or not.
Similar to PVC and PV, we could make SnapshotRequest namespaced and the Snapshot object non-namespaced, so that users/developers work with the namespaced request.
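Following the PVC/PV analogy, the split might look roughly like this. Again, this is a sketch under assumptions: the `apiVersion`, the `spec` and `status` fields, and all names and values are hypothetical placeholders.

```yaml
# Hypothetical sketch only: all fields and values are placeholders.
# Namespaced object, created by the user (analogous to a PVC).
apiVersion: snapshot.example.io/v1alpha1   # placeholder group/version
kind: SnapshotRequest
metadata:
  name: snap-pvc1
  namespace: team-a
spec:
  sources:
    - kind: PersistentVolumeClaim
      name: pvc1
---
# Non-namespaced object, created by the controller (analogous to a PV).
apiVersion: snapshot.example.io/v1alpha1
kind: Snapshot
metadata:
  name: snapshot-0001        # note: no metadata.namespace
spec:
  persistentVolumeName: pv-0001
  requestRef:                # back-reference to the namespaced request
    namespace: team-a
    name: snap-pvc1
status:
  phase: Available           # e.g. Pending or Available
  size: 10Gi                 # placeholder value
```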
@MikaelCluseau's Example
Any comments and suggestions for this would be appreciated!