Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Images Sub-specification #2158

Conversation

zachncst
Copy link

@zachncst zachncst commented Nov 5, 2019

Proposing a means of standardizing image locations for all operators to
use.

Description of the change:
Proposal for new operator-sdk feature to add an ImageSpec type and ability to define a list of ImageSpec types on CRDs for an operator. The ImageSpec type would allow the definition of images for lookup in creating common secondary resources. The proposal would include tooling to support the type defined on Operators.

Motivation for the change:
Many operators as for example memcache-example and redis-operator will at the end have CRD's which will represent a statefulset/deployment/pod of an image:tag. As an operator user, I'd like to have a feature to make this process easier.

Proposing a means of standardizing image locations for all operators to
use.
@openshift-ci-robot
Copy link

Hi @zachncst. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 5, 2019
Copy link
Contributor

@camilamacedo86 camilamacedo86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understood properly this proposal. Note that the resource/object which represents the operator is the operator.yaml and not the CRD's which will be created to represents the pieces/units that are required for all solution.

So, in this way, I cannot see why/how would be possible to add in a CRD all images used for one operator project. What CRD should be used? Note that each operator could have N CRD's.

Also, I'd like to recommend check #2151 and #2132 in order to provide a better explanation over it. So, IMHO it goes against the design and concepts adopted for k8s projects and APIs as operator-framework.

Copy link
Member

@kevinrizza kevinrizza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be careful with what we define in terms of the relationship between Kubernetes types like the CustomResourceDefinition and the Operator pattern. This proposal seems to be implying that there should be some kind of link between the CRD definition itself and a particular containerized application image. But I would strongly argue that no such link needs to exist, and in fact creating such a link severely limits the value of the Operator pattern.

Operators can do much more than that. An Operator defines a first class relationship between a custom type in Kubernetes and a controller. This relationship gives an Operator developer a framework to allow strongly typed custom user input to drive operational logic in a well maintained an upgradeable manner. Yes, you can create a CRD that defines the configuration and deployment of a service or database. But you can also define a control plane of a set of deployments of multiple container images or even a set of other Kubernetes types that are not "application" deployments at all. A common emerging pattern is the "Operator of Operators" which could easily abstract the monitoring or maintenance of multiple operators at once through pure Kube API calls without deploying any sort of "operand".

To be fair, this proposal defining a field like this as optional would certainly not prevent existing users from continuing to use these patterns. But making something like this 1:1 relationship between Operator and Operand a first class citizen limits the awareness of what the Operator pattern is capable of in general.

@camilamacedo86 camilamacedo86 self-assigned this Nov 6, 2019
@camilamacedo86

This comment has been minimized.

@zachncst

This comment has been minimized.

@camilamacedo86

This comment has been minimized.

@zachncst

This comment has been minimized.

@camilamacedo86

This comment has been minimized.

@zachncst

This comment has been minimized.

@camilamacedo86
Copy link
Contributor

camilamacedo86 commented Nov 6, 2019

Hi @zachncst,

I think that your goal is more clear to me. So, I will hide as resolved the comments in order to make easier the review for others ( too much verbose take too long for someone check ) and, then make some comments/suggestions in the doc. Also, let's see what others input as well.

@zachncst
Copy link
Author

zachncst commented Nov 6, 2019

@camilamacedo86 Sounds great, I'd also like to mention I'm happy to help integrate the proposal as well. I'm not just blind firing here. Happy to become a contributor.

@@ -0,0 +1,190 @@
---
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest/recommend improve the first comment to open this PR in order to make the goal more clear. Following some ideas.

Description of the change:
Proposal for new operator-sdk future to allow users to inform images and the common secondary resources required for it be created as all logic which would be required to follow up the best practices to manage them.

Motivation for the change:
Many operators as for example x and x will at the end have CRD's which will represent a deployment/pod of
an image:tag. So, I am as oper user I'd like to have a feature to make this process easier.

Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS.: The goal would be in a simple way make clear what is the problem that you are trying to solve with, in order to avoid misunderstands and let it open for contributions too. The community, users and others may bring good ideas.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll go ahead and do this.

@shawn-hurley
Copy link
Member

@zachncst can we update the proposal to include more of the comments. This is what I took away:

  1. We want to create a re-usable type called "ImageSpec" that would have some special library bits and defined openapi definition such that a user of the SDK MAY choose to use this.
  2. That we intend for the process of this type in a CRD to mean something special to scorecard (linter and functional testing) such that we can write specific optional tests. These tests may not be on by default until they are stable (WDYT?)
  3. Update the stories to make it clear that someone can add this field to their operator during creation, potentially with a CLI flag?

I think I might be missing some, but then I would like to see the type that we are defining and what we imagine the library would need to do. I would also like to see example use cases for more than a simple Memcached. Can we show how this would potentially work if I had more than 1 operand image? Is there some sort of default defined? is this an optional field if it is available such that a cluster-admin could set a default somehow?

Overall I think this is a good start to a new feature, just want to flesh out some more of the design in the proposal WDYT?

overridable on the operator custom resource definitions (CRDs). This allows a user of a deployed
operator to override the images needed at deployment of the custom resource
(CR).

Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO: Would make more sense have a feature which makes the process to implement the common requirement to deploy an image and manage its deployment/pod following the good practices instead of make it be mandatory.

Following my suggestion.

  • Add an image flag for operator-sdk add api --api-version=app.example.com/v1alpha1 --kind=AppService --image=ngnix:latest

Then, when it be used the CR/CRD and API with few standard specs will be scaffold.

  • Add an flag tooperator-sdk add controller --api-version=app.example.com/v1alpha1 --kind=AppService --image-gen

Then, it would scaffold the controller with watches and a reconcile impl with the standard/basic code to have a deployment and manage its status.

Reasons:

  • Give flexibility to the users and allow them to change the code according to their needs
  • We will scaffold the code, and then it also can be improved to be integrated/used with other projects as kubebuilder for example
  • A lot of images also will require specs for its env vars. E.g. PostgreSQL requires user, database name, PWD. If we keep it fixed, will it be really useful?

OutOfScope: it could be improved to accept more flags or use a config.yaml file, for example, to define if should or not expose a service, use env vars, should create or not a secret, etc .. For these reasons, we could just start from just a flag as POC but it cannot be fixed/mandatory and etc. It could be improved and grow a lot in the future.

c/c @joelanford

```

#### Add a scorecard test to verify changes to image locations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep just the impl to gen the basic code and keep a smaller scope. If it is accepted and implemented, then we can have a better idea over how should be the next steps or additional needs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@zachncst
Copy link
Author

zachncst commented Nov 6, 2019

@zachncst can we update the proposal to include more of the comments. This is what I took away:

1. We want to create a re-usable type called "ImageSpec" that would have some special library bits and defined openapi definition such that a user of the SDK _MAY_ choose to use this.

Yes.

2. That we intend for the process of this type in a CRD to mean something special to scorecard (linter and functional testing) such that we can write specific optional tests. These tests may not be on by default until they are stable (WDYT?)

Yes, agreed.

3. Update the stories to make it clear that someone can _add_ this field to their operator during creation, potentially with a CLI flag?

Yes, exactly, Think @camilamacedo86 highlighted this best in her reply. Something like she described would be ideal.

I think I might be missing some, but then I would like to see the type that we are defining and what we imagine the library would need to do. I would also like to see example use cases for more than a simple Memcached. Can we show how this would potentially work if I had more than 1 operand image? Is there some sort of default defined? is this an optional field if it is available such that a cluster-admin could set a default somehow?

Example is a good idea, do you (@shawn-hurley) know of a good operator example that you could provide so we can work through it?

Defaults are interesting. In later versions of kubernetes CRD validation (1.16), we can just use the OpenAPIv3 default method to provide a starter list. There is also CustomResourceDefaulting in beta. If that isn't supported then the defaults are basically in the example that the user uses to deploy. Or a fallback to the controller having a set when not provided in the CR.

Overall I think this is a good start to a new feature, just want to flesh out some more of the design in the proposal WDYT?

Yep, agreed.

--

What I'm going to do is wait a few hours then take all this feedback and edit the spec to include the details we've worked out.

@camilamacedo86 camilamacedo86 added the kind/documentation Categorizes issue or PR as related to documentation. label Nov 7, 2019
@zachncst
Copy link
Author

zachncst commented Nov 7, 2019

Updates made.

@@ -84,12 +100,35 @@ Implementing the proposal would include:

### User Stories

Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the proposals in this repo are not exactly in this way. But, IHMO should describe the User Stories by following its concept. E.g 👍

  • As a user, I'd like to inform an image and see the operator-SDK scaffolding the CR/CRD with the common expected specs and validations generated by it.
  • As a user, I'd like to inform an image and see the operator-SDK scaffolding the controller with the watches and reconcile actions implemented to create a Deployment and Service generated by it.

See https://www.mountaingoatsoftware.com/agile/user-stories. Then, we should add acceptance criteria. See here an example over how I think that we should do it.

Note that in the template we have the field ### Implementation Details/Notes/Constraints to describe and add notes over how we are suggesting technically the implementation be done, however, the rest of the doc in POV should be focused in the business rules. It helps us understand the goal and have a good definition of what are the behaviours expected when the proposal be implemented.

information (like pull secrets).
- Increase transparency with what operators are installing.
- Simple, reusable specs to build operators; similar to PodSpec in Kubernetes.

Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May:

Suggested change
- Provide an easy mechanism to allow the operator to generate a scaffold CR/CRD, types and controller managing all common secondary resources expected to have a deployment of an image
- Increase the quality of the operators by leading and help users do its implementation following its design and concepts.
- Reduce the effort or users develop solutions with.


### Non-Goals

- OLM management of the images spec is not in this proposal.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we scaffold the code we will provide the facility and all other features will work for this. So, I do not think that it is required.

Suggested change
- OLM management of the images spec is not in this proposal.

all images used by the operator. An example:

```yaml
apiVersion: app.example.com/v1alpha1
Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO:

  • We should not create a YAML file now with the images and etc.
  • We could use a flag to inform the image and scaffold a common implementation instead of that.

Reasons:

  • Give flexibility for the user do changes
  • Make transparent what the project is dong instead of having a block magic box that does things
  • We should began by doing this process for only 1 image in order to keep it simple and then, after we have it working and all improvements that will happen by following an agile process and its interactions we could see better if it would be good or not allow it to be done for N/(a list of) images by just one action.

Note that this approach will not just allow us to grow with consistency but make easier the process to implement it since if we have it working as expected for 1 image should not be a big deal improve the facility to N if we identify that it is would be a good feature.


### Future Work

#### Validation
Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we scaffold the code all the other features will work as expected and we do not need to add any specific test in the scorecard. IMHO the scorecard should be 100% independent and be allowed to be used in any operator project done with SDK or NOT. I mean, It should ensure the quality of the project no matter how the user implemented it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your scaffold idea moves away from the original proposal and I disagree.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of a scorecard to me is to grade you on how well you have done something. Getting a grade on your operator should include following good practices and standardizations. I think looking for this feature should totally be in scope.

the openapi v3 schema is in beta. Using enum is another method of providing a
default value but requires input by the creator of the CR. Mutating webhooks
is another method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we following my suggestion described above I think we will not have these risks.

- Different operators may already solve this problem in different ways.
- Not all CRDs require an image (adding a user to a database for example).
- Adds a requried data to an operator CRD that is otherwise freeform.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we following my suggestion described above I think we will not have Drawbacks as well.


NA.

## Implementation History
Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to add all topics of the templates.
That ones which are not required or not make sense for the proposal can be removed/ignored.

Suggested change
## Implementation History


NA.

### Version Skew Strategy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Version Skew Strategy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to add all topics of the templates.
That ones which are not required or not make sense for the proposal can be removed/ignored.

default value but requires input by the creator of the CR. Mutating webhooks
is another method.

### Upgrade / Downgrade Strategy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Upgrade / Downgrade Strategy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to add all topics of the templates.
That ones which are not required or not make sense for the proposal can be removed/ignored.

- Directly couples the image versions with the operator.
- Requires OLM support, or custom kubernetes runtimes (CRI-O), to override the image
locations at runtime without operator support.
- Images cannot be overriden by a user.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this proposal is accepted here I think we could aligned it with OLM team and move at least at the first moment doing it just in SDK. Also, IHMO it shows to be more part of the SDK scope than OLM.

@zachncst
Copy link
Author

zachncst commented Nov 8, 2019

Thanks for the review @camilamacedo86. I'll be reviewing comments and will address when I can. I will also work up an example project to share so we can talk a little more to a real example and a little less to the prose in this proposal.

- name: example
iamge: my-registry:5000/example:1.0.0
```

Copy link
Contributor

@camilamacedo86 camilamacedo86 Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other reason for this file not be created yet is because of kubebuilder add all API, version kinds generated by in the PROJECT file already as the following example.

version: "2"
domain: my.domain
repo: test
resources:
- group: webapp
  version: v1
  kind: Guestbook
- group: webapp
  version: v1
  kind: Guestbook2

So, after the integration, we might use it to add the images as well. For example.

version: "2"
domain: my.domain
repo: test
resources:
- group: webapp
  version: v1
  kind: Guestbook
- group: webapp
  version: v1
  kind: Guestbook2
  image: image:tag 

NOTE: It is just one idea to illustrate the N possibilities that it may have to be improved, because of this, ì think we should start by just using the flag and then, after all, will be easier improve and/or adopted a kind of config by yaml file, kustomize, and etc ..

@zachncst
Copy link
Author

I'm going to close this PR. I've spoken with some other Operator devs and users and think this can be retooled a bit. I'll bring back an alternate proposal taking the feedback from here. Thanks for reviewing this, it's given me a lot of insight into the team and philosophies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/documentation Categorizes issue or PR as related to documentation. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants