This document offers high-level information about OpenShift ClusterOperators and Operands.
It also provides information about developing with OpenShift operators and the OpenShift release payload.
While updating READMEs in core OpenShift repositories, we realized there was significant overlap in their content. We've created this document to serve as a common link for those READMEs, answering "What is an Operator?" and "How do I build/update/test/deploy this code?".
Many OpenShift Operators were built using openshift/library-go's framework and also utilize openshift/library-go's build-machinery-go, a collection of building blocks, Makefile targets, helper scripts, and other build-related machinery. As a result, there are common methods for building, updating, and developing these operators. When adding features or investigating bugs, you may need to swap out a component layer of a release payload for a locally built one. Again, the process for doing this is shared among OpenShift operators, as well as the process of updating CVO-managed deployments in a cluster.
OpenShift deploys many operators on Kubernetes. Core OpenShift operators are ClusterOperators (COs). You can list them with:
$ oc get clusteroperators
To get a description of a CO, run:
$ oc describe co/clusteroperatorname
For simplicity, ClusterOperator resources only include negative conditions (i.e. if a condition required by an operator is not met). When troubleshooting a degraded cluster operator, it may be helpful to see all conditions, negative and positive. This can be observed by checking the description of the corresponding operator custom resource:
$ oc describe clusteroperatorname.operator.openshift.io/cluster
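For example, to print just the condition types and statuses from the operator CR (a jsonpath sketch; the resource kind here, openshiftapiservers.operator.openshift.io, varies per operator):
$ oc get openshiftapiservers.operator.openshift.io/cluster -o jsonpath='{range .status.conditions[*]}{.type}{"="}{.status}{"\n"}{end}'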
Operators are controllers for a Custom Resource. Operators automate many tasks in application management, such as deployments, backups, upgrades, leader election, reconciling resources, etc.
The Custom Resource Definitions in OpenShift can be viewed with:
$ oc get crds
For a particular CRD (for example, openshiftapiservers.operator.openshift.io):
$ oc explain openshiftapiservers.operator.openshift.io
For more information, check out the Operator pattern in Kubernetes.
While ClusterOperators manage individual components (Custom Resources), the Cluster Version Operator (CVO) is responsible for managing the ClusterOperators.
The component that an operator manages is its operand. For example, the cluster-kube-controller-manager-operator CO manages the cluster-kube-controller-manager component running in OpenShift. An operator can be responsible for multiple components. For example, the CVO manages all of the COs (in this way ClusterOperators are also operands).
To get a list of the components and their images that comprise an OpenShift release image, grab a release from the openshift release page and run:
$ oc adm release info registry.ci.openshift.org/ocp/release:version
If the above command fails, you may need to authenticate against registry.ci.openshift.org. If you are an OpenShift developer, see authenticating against the CI registry.
You'll notice that currently the release payload is just shy of 100 images.
You can check on the status of any ClusterOperator with:
$ oc get co/clusteroperatorname -o yaml
At any time, the status of a CO may be:
- Available
- Degraded
- Progressing
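If you want to script against these conditions, you can block until a CO reports Available (a sketch using the standard oc wait; the CO name is just an example):
$ oc wait --for=condition=Available co/kube-apiserver --timeout=60s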
For more detail, see the CVO Cluster Operator Status Conditions overview.
By default, an OpenShift ClusterOperator exposes Prometheus metrics via a metrics service.
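As a sketch of how to inspect those metrics (the service name and namespace below are assumptions; check your operator's namespace for the actual metrics service):
$ oc -n openshift-apiserver-operator port-forward service/metrics 8443:443 &
$ curl -k -H "Authorization: Bearer $(oc whoami -t)" https://localhost:8443/metrics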
Operators expose events that can help in debugging issues. To get operator events, run the following command:
$ oc get events -n [cluster-operator-namespace]
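For example, to see events in an operator namespace ordered by age (the namespace is an example; --sort-by is standard oc/kubectl):
$ oc get events -n openshift-apiserver-operator --sort-by=.metadata.creationTimestamp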
In an OpenShift repository that utilizes openshift/library-go's build-machinery-go, a useful command to list all make targets is:
$ make help
To build the binaries:
$ make build
To build the images (or use make help to get the target for an individual image):
$ make images
Note: see the issues with imagebuilder and make images section below.
To run unit tests:
$ make test-unit
To run verify-gofmt, verify-govet, verify-bindata (if applicable), and verify-codegen (individual Makefiles may add other verify targets):
$ make verify
To run update-bindata (if applicable), update-codegen, and update-gofmt:
$ make update
If you have exported your KUBECONFIG to point to a running cluster, you can run the end-to-end tests that live in the repository (the same tests that run in CI as e2e-aws-operator):
$ make test-e2e
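For example (the path is an assumption; the installer writes a kubeconfig under your install dir's auth/ directory):
$ export KUBECONFIG=/path/to/installdir/auth/kubeconfig
$ make test-e2e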
If the repository utilizes glide for dependency management, you can update dependencies with:
$ make update-deps
If the repository uses go modules (gomod) for dependency management, this doc is useful.
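As a sketch, bumping a dependency with go modules typically looks like this (the module path is an example; the vendor step applies only if the repo vendors its dependencies):
$ go get github.com/openshift/library-go@master
$ go mod tidy
$ go mod vendor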
When developing on OpenShift, you'll want to test your changes in a running cluster. Any component image in a release payload can be substituted for a locally built image. You can test your changes in an OpenShift cluster like so:
- If using quay.io, repositories are by default private. Visit quay.io and change settings to public for any new image repos you push. This is to allow OpenShift to pull your image.
The operator deployment is modified to reference a test operand image, rather than modifying the operand deployment directly, because an operator is meant to stomp on changes made to its operand. Note: the operator and operand may share a repository (see below).
For this example, a change to the openshift-apiserver operand is being tested.
- Build the operand image and push it to a public registry (use any containers CLI; quay.io, docker.io). If make images doesn't work, your Makefile may need an update; see the imagebuilder notes below.
$ cd local/path/to/openshift-apiserver
$ make IMAGE_TAG=quay.io/yourname/openshift-apiserver:test images
$ buildah push quay.io/yourname/openshift-apiserver:test
- Edit the operator deployment definition to reference your test image, and build the test operator image. Each operator (e.g., openshift/cluster-openshift-apiserver-operator) has a manifests/*.deployment.yaml that sets the IMAGE env for its operand image. Edit that manifests/deployment yaml to reference your test image in spec.containers.env, like so:
spec:
  serviceAccountName: openshift-apiserver-operator
  containers:
  - name: openshift-apiserver-operator
    env:
    - name: IMAGE
      value: quay.io/yourname/openshift-apiserver:test
- Build the operator image with the updated deployment above and push it to a public registry (use any containers CLI; quay.io, docker.io). If make images doesn't work, your Makefile may need an update; see the imagebuilder notes below.
$ cd local/path/to/cluster-openshift-apiserver-operator
$ make IMAGE_TAG=quay.io/yourname/openshift-apiserver-operator:test images
$ buildah push quay.io/yourname/openshift-apiserver-operator:test
- Disable the CVO or tell the CVO to ignore your component. The CVO reconciles a cluster to its known-good state as laid out in its resource definitions. You cannot edit a deployment without first disabling the CVO (well, you can, but the CVO will reconcile and stomp on any changes you make). There are 2 ways to work around the CVO; you'll need to either:
- Set your operator to an unmanaged state. See here for how to patch the clusterversion/version object (a patch sketch follows below), or
- Scale down the CVO and edit a deployment in the running cluster like so:
$ oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
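For the unmanaged path above, here is a sketch of the clusterversion patch (the component coordinates match this walkthrough's operator; adjust them for yours):
$ oc patch clusterversion version --type json -p '[
    {"op": "add", "path": "/spec/overrides", "value": [{
      "kind": "Deployment",
      "group": "apps",
      "name": "openshift-apiserver-operator",
      "namespace": "openshift-apiserver-operator",
      "unmanaged": true}]}]'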
- Edit the ClusterOperator deployment in the running cluster (you'll need to be logged in as an admin user).
$ oc get deployments -n openshift-apiserver-operator
$ oc edit deployment openshift-apiserver-operator -n openshift-apiserver-operator
Edit the env OPERATOR_IMAGE (if it exists), the env IMAGE, and the container image, like so:
spec:
  containers:
  - image: quay.io/yourname/openshift-apiserver-operator:test
    env:
    - name: IMAGE
      value: quay.io/yourname/openshift-apiserver:test
    - name: OPERATOR_IMAGE
      value: quay.io/yourname/openshift-apiserver-operator:test
Exception: the service-ca-operator deployment instead uses:
    env:
    - name: CONTROLLER_IMAGE
      value: quay.io/yourname/service-ca-operator:test
You'll see a new deployment roll out, and in the operand namespace, openshift-apiserver, a new deployment rolls out as well, using your openshift-apiserver:test image.
To set your cluster back to its original state, you can simply:
$ oc scale --replicas 1 -n openshift-cluster-version deployments/cluster-version-operator
or remove the overrides section you added to clusterversion/version.
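For example, to drop the overrides you added (a sketch; this removes the entire overrides list):
$ oc patch clusterversion version --type json -p '[{"op": "remove", "path": "/spec/overrides"}]'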
For this example, I'll start with the release image registry.ci.openshift.org/ocp/release:4.2 and test a change to the github.com/openshift/openshift-apiserver repository.
- Build the image and push it to a registry (use any containers CLI; quay.io, docker.io). If make images doesn't work, your Makefile may need an update; see the imagebuilder notes below.
$ cd local/path/to/openshift-apiserver
$ make IMAGE_TAG=quay.io/yourname/openshift-apiserver:test images
$ buildah push quay.io/yourname/openshift-apiserver:test
- Assemble a release payload with your test image and push it to a registry. Get the name of the image (openshift-apiserver) you want to substitute:
$ oc adm release info registry.ci.openshift.org/ocp/release:4.2
If the above command fails, you may need to authenticate against registry.ci.openshift.org. If you are an OpenShift developer, see authenticating against the CI registry.
This command will assemble a release payload incorporating your test image and push it to your quay.io repository. Be sure to set this repository in quay.io as public.
$ oc adm release new --from-release registry.ci.openshift.org/ocp/release:4.2 \
openshift-apiserver=quay.io/yourname/openshift-apiserver:test \
--to-image quay.io/yourname/release:test
If the above command succeeds, move on to Step 3. If it fails, you need to authenticate against quay.io; see authenticating against quay.io.
- Extract the installer binary from the release payload that has your test image. This extracts the openshift-install binary, pinned to your test release image.
$ oc adm release extract --command openshift-install quay.io/yourname/release:test
- Run the installer extracted from your release image.
$ ./openshift-install create cluster --dir /path/to/installdir
Once the install completes, you'll have a cluster running that was launched with a known-good release payload with whatever test image(s) you've substituted.
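To confirm the cluster came up with your substituted image, you can inspect your test payload again (a quick check; the grep target matches this example's image):
$ oc adm release info quay.io/yourname/release:test | grep openshift-apiserver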
make images utilizes imagebuilder. There are a few known issues with imagebuilder:
- openshift/imagebuilder#144 - the workaround is to prepull any images that fail, with docker or buildah, as outlined in the issue
- openshift/imagebuilder#140 - the workaround is something like this
- other issues here
- Your Makefile may need to update its build-image call to follow this example.
- If make images still isn't working, you can replace it with a buildah or docker command like so (the Dockerfile name varies per repository; this example uses Dockerfile.rhel):
$ buildah bud -f Dockerfile.rhel -t quay.io/myimage:local .
or
$ docker build -f Dockerfile.rhel -t quay.io/myimage:local .
For some operator repositories, such as openshift/service-ca-operator, the controller (operand) images are included in the operator image. Any changes to the controllers are made in the operator repository. When testing a change, there is only the single operator image substitution, rather than a separate operand image build plus operator image build for such operators.
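For such a repo, the build-and-substitute flow collapses to one image; a sketch mirroring the steps above:
$ cd local/path/to/service-ca-operator
$ make IMAGE_TAG=quay.io/yourname/service-ca-operator:test images
$ buildah push quay.io/yourname/service-ca-operator:test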
(Internal Red Hat registries for developer testing)
- Login at https://oauth-openshift.apps.ci.l2s4.p1.openshiftapps.com/oauth/token/request with your Kerberos ID at Red Hat SSO.
- Once logged in, an API token will be displayed. Please copy it.
- Then log in, writing the credentials to a registry.json file, like this:
$ podman login --authfile registry.json -u ${KERBEROS_ID} -p ${TOKEN}
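You can then point oc at that auth file, for example (oc adm release subcommands accept --registry-config):
$ oc adm release info --registry-config registry.json registry.ci.openshift.org/ocp/release:4.2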
Add the necessary credentials to your local ~/.docker/config.json (or equivalent file) like so:
- Visit https://try.openshift.com, click GET STARTED, log in with your Red Hat account if you aren't already, choose any Infrastructure Provider, and Copy Pull Secret to a local file (or download it).
- Add the quay auth from the pull secret to ~/.docker/config.json. The config file should have the following:
$ cat ~/.docker/config.json
{
"auths": {
"registry.svc.ci.openshift.org": {
"auth": "blahblahblah"
},
"quay.io": {
"auth": "blahblahblah"
}
}
}