Skip to content

Commit

Permalink
feat: RunnerSet backed by StatefulSet (#629)
Browse files Browse the repository at this point in the history
* feat: RunnerSet backed by StatefulSet

Unlike a runner deployment, a runner set can manage a set of stateful runners by combining a statefulset and an admission webhook that mutates statefulset-managed pods with required envvars and registration tokens.

Resolves #613
Ref #612

* Upgrade controller-runtime to 0.9.0

* Bump Go to 1.16.x following controller-runtime 0.9.0

* Upgrade kubebuilder to 2.3.2 for updated etcd and apiserver following local setup

* Fix startup failure due to missing LeaderElectionID

* Fix the issue that any pods become unable to start once actions-runner-controller got failed after the mutating webhook has been registered

* Allow force-updating statefulset

* Fix runner container missing work and certs-client volume mounts and DOCKER_HOST and DOCKER_TLS_VERIFY envvars when dockerdWithinRunner=false

* Fix runnerset-controller not applying statefulset.spec.template.spec changes when there were no changes in runnerset spec

* Enable running acceptance tests against arbitrary kind cluster

* RunnerSet supports non-ephemeral runners only today

* fix: docker-build from root Makefile on intel mac

* fix: arch check fixes for mac and ARM

* ci: aligning test data format and patching checks

* fix: removing namespace in test data

* chore: adding more ignores

* chore: removing leading space in shebang

* Re-add metrics to org hra testdata

* Bump cert-manager to v1.1.1 and fix deploy.sh

Co-authored-by: toast-gear <[email protected]>
Co-authored-by: Callum James Tait <[email protected]>
  • Loading branch information
3 people authored Jun 22, 2021
1 parent af0ca03 commit 9e4dbf4
Show file tree
Hide file tree
Showing 54 changed files with 27,387 additions and 9,708 deletions.
1 change: 1 addition & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ config
charts
.github
.envrc
.env
*.md
*.txt
*.sh
10 changes: 7 additions & 3 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,15 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v2
- uses: actions/setup-go@v2
with:
go-version: '^1.16.5'
- run: go version
- name: Install kubebuilder
run: |
curl -L -O https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.2.0/kubebuilder_2.2.0_linux_amd64.tar.gz
tar zxvf kubebuilder_2.2.0_linux_amd64.tar.gz
sudo mv kubebuilder_2.2.0_linux_amd64 /usr/local/kubebuilder
curl -L -O https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.2/kubebuilder_2.3.2_linux_amd64.tar.gz
tar zxvf kubebuilder_2.3.2_linux_amd64.tar.gz
sudo mv kubebuilder_2.3.2_linux_amd64 /usr/local/kubebuilder
- name: Run tests
run: make test
- name: Verify manifests are up-to-date
Expand Down
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
# Deploy Assets
release

# Binaries for programs and plugins
Expand All @@ -15,7 +16,6 @@ bin
*.out

# Kubernetes Generated files - skip generated files, except for vendored files

!vendor/**/zz_generated.*

# editor and IDE paraphernalia
Expand All @@ -25,6 +25,7 @@ bin
*~

.envrc
.env
*.pem

# OS
Expand Down
52 changes: 33 additions & 19 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -11,13 +11,20 @@ TEST_REPO ?= ${DOCKER_USER}/actions-runner-controller
TEST_ORG ?=
TEST_ORG_REPO ?=
SYNC_PERIOD ?= 5m
USE_RUNNERSET ?=
KUBECONTEXT ?= kind-acceptance
CLUSTER ?= acceptance
CERT_MANAGER_VERSION ?= v1.1.1

# From https://github.com/VictoriaMetrics/operator/pull/44
YAML_DROP=$(YQ) delete --inplace
YAML_DROP_PREFIX=spec.validation.openAPIV3Schema.properties.spec.properties

# If you encounter errors like the below, you are very likely to update this to follow e.g. CRD version change:
# CustomResourceDefinition.apiextensions.k8s.io "runners.actions.summerwind.dev" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
YAML_DROP_PREFIX=spec.versions[0].schema.openAPIV3Schema.properties.spec.properties

# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:trivialVersions=true"
CRD_OPTIONS ?= "crd:trivialVersions=true,generateEmbeddedObjectMeta=true"

# Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
ifeq (,$(shell go env GOBIN))
Expand Down Expand Up @@ -158,31 +165,31 @@ acceptance: release/clean acceptance/pull docker-build release
acceptance/run: acceptance/kind acceptance/load acceptance/setup acceptance/deploy acceptance/tests acceptance/teardown

acceptance/kind:
kind create cluster --name acceptance --config acceptance/kind.yaml
kind create cluster --name ${CLUSTER} --config acceptance/kind.yaml

# Set TMPDIR to somewhere under $HOME when you use docker installed with Ubuntu snap
# Otherwise `load docker-image` fail while running `docker save`.
# See https://kind.sigs.k8s.io/docs/user/known-issues/#docker-installed-with-snap
acceptance/load:
kind load docker-image ${NAME}:${VERSION} --name acceptance
kind load docker-image quay.io/brancz/kube-rbac-proxy:v0.10.0 --name acceptance
kind load docker-image ${RUNNER_NAME}:${RUNNER_TAG} --name acceptance
kind load docker-image docker:dind --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-controller:v1.0.4 --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-cainjector:v1.0.4 --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-webhook:v1.0.4 --name acceptance
kubectl cluster-info --context kind-acceptance
kind load docker-image ${NAME}:${VERSION} --name ${CLUSTER}
kind load docker-image quay.io/brancz/kube-rbac-proxy:v0.10.0 --name ${CLUSTER}
kind load docker-image ${RUNNER_NAME}:${RUNNER_TAG} --name ${CLUSTER}
kind load docker-image docker:dind --name ${CLUSTER}
kind load docker-image quay.io/jetstack/cert-manager-controller:$(CERT_MANAGER_VERSION) --name ${CLUSTER}
kind load docker-image quay.io/jetstack/cert-manager-cainjector:$(CERT_MANAGER_VERSION) --name ${CLUSTER}
kind load docker-image quay.io/jetstack/cert-manager-webhook:$(CERT_MANAGER_VERSION) --name ${CLUSTER}
kubectl cluster-info --context ${KUBECONTEXT}

# Pull the docker images for acceptance
acceptance/pull:
docker pull quay.io/brancz/kube-rbac-proxy:v0.10.0
docker pull docker:dind
docker pull quay.io/jetstack/cert-manager-controller:v1.0.4
docker pull quay.io/jetstack/cert-manager-cainjector:v1.0.4
docker pull quay.io/jetstack/cert-manager-webhook:v1.0.4
docker pull quay.io/jetstack/cert-manager-controller:$(CERT_MANAGER_VERSION)
docker pull quay.io/jetstack/cert-manager-cainjector:$(CERT_MANAGER_VERSION)
docker pull quay.io/jetstack/cert-manager-webhook:$(CERT_MANAGER_VERSION)

acceptance/setup:
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.4/cert-manager.yaml #kubectl create namespace actions-runner-system
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/$(CERT_MANAGER_VERSION)/cert-manager.yaml #kubectl create namespace actions-runner-system
kubectl -n cert-manager wait deploy/cert-manager-cainjector --for condition=available --timeout 90s
kubectl -n cert-manager wait deploy/cert-manager-webhook --for condition=available --timeout 60s
kubectl -n cert-manager wait deploy/cert-manager --for condition=available --timeout 60s
Expand All @@ -191,11 +198,12 @@ acceptance/setup:
sleep 5

acceptance/teardown:
kind delete cluster --name acceptance
kind delete cluster --name ${CLUSTER}

acceptance/deploy:
NAME=${NAME} DOCKER_USER=${DOCKER_USER} VERSION=${VERSION} RUNNER_NAME=${RUNNER_NAME} RUNNER_TAG=${RUNNER_TAG} TEST_REPO=${TEST_REPO} \
TEST_ORG=${TEST_ORG} TEST_ORG_REPO=${TEST_ORG_REPO} SYNC_PERIOD=${SYNC_PERIOD} \
USE_RUNNERSET=${USE_RUNNERSET} \
acceptance/deploy.sh

acceptance/tests:
Expand All @@ -205,8 +213,14 @@ acceptance/tests:
github-release: release
ghr ${VERSION} release/

# find or download controller-gen
# download controller-gen if necessary
# Find or download controller-gen
#
# Note that controller-gen newer than 0.4.1 is needed for https://github.com/kubernetes-sigs/controller-tools/issues/444#issuecomment-680168439
# Otherwise we get errors like the below:
# Error: failed to install CRD crds/actions.summerwind.dev_runnersets.yaml: CustomResourceDefinition.apiextensions.k8s.io "runnersets.actions.summerwind.dev" is invalid: [spec.validation.openAPIV3Schema.properties[spec].properties[template].properties[spec].properties[containers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property, spec.validation.openAPIV3Schema.properties[spec].properties[template].properties[spec].properties[initContainers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property]
#
# Note that controller-gen newer than 0.6.0 is needed due to https://github.com/kubernetes-sigs/controller-tools/issues/448
# Otherwise ObjectMeta embedded in Spec results in empty on the storage.
controller-gen:
ifeq (, $(shell which controller-gen))
ifeq (, $(wildcard $(GOBIN)/controller-gen))
Expand All @@ -215,7 +229,7 @@ ifeq (, $(wildcard $(GOBIN)/controller-gen))
CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
cd $$CONTROLLER_GEN_TMP_DIR ;\
go mod init tmp ;\
go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.3.0 ;\
go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.6.0 ;\
rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
}
endif
Expand Down
106 changes: 79 additions & 27 deletions acceptance/checks.sh
Original file line number Diff line number Diff line change
@@ -1,32 +1,84 @@
#!/usr/bin/env bash

set -e
set +e

runner_name=
repo_runnerdeployment_passed="skipped"
repo_runnerset_passed="skipped"

while [ -z "${runner_name}" ]; do
echo Finding the runner... 1>&2
sleep 1
runner_name=$(kubectl get runner --output=jsonpath="{.items[*].metadata.name}")
done
echo "Checking if RunnerDeployment repo test is set"
if [ "${TEST_REPO}" ] && [ ! "${USE_RUNNERSET}" ]; then
runner_name=
count=0
while [ $count -le 30 ]; do
echo "Finding Runner ..."
runner_name=$(kubectl get runner --output=jsonpath="{.items[*].metadata.name}")
if [ "${runner_name}" ]; then
while [ $count -le 30 ]; do
runner_pod_name=
echo "Found Runner \""${runner_name}"\""
echo "Finding underlying pod ..."
runner_pod_name=$(kubectl get pod --output=jsonpath="{.items[*].metadata.name}" | grep ${runner_name})
if [ "${runner_pod_name}" ]; then
echo "Found underlying pod \""${runner_pod_name}"\""
echo "Waiting for pod \""${runner_pod_name}"\" to become ready..."
kubectl wait pod/${runner_pod_name} --for condition=ready --timeout 270s
break 2
fi
sleep 1
let "count=count+1"
done
fi
sleep 1
let "count=count+1"
done
if [ $count -ge 30 ]; then
repo_runnerdeployment_passed=false
else
repo_runnerdeployment_passed=true
fi
echo "Checking if RunnerSet repo test is set"
elif [ "${TEST_REPO}" ] && [ "${USE_RUNNERSET}" ]; then
runnerset_name=
count=0
while [ $count -le 30 ]; do
echo "Finding RunnerSet ..."
runnerset_name=$(kubectl get runnerset --output=jsonpath="{.items[*].metadata.name}")
if [ "${runnerset_name}" ]; then
while [ $count -le 30 ]; do
runnerset_pod_name=
echo "Found RunnerSet \""${runnerset_name}"\""
echo "Finding underlying pod ..."
runnerset_pod_name=$(kubectl get pod --output=jsonpath="{.items[*].metadata.name}" | grep ${runnerset_name})
echo "BEFORE IF"
if [ "${runnerset_pod_name}" ]; then
echo "AFTER IF"
echo "Found underlying pod \""${runnerset_pod_name}"\""
echo "Waiting for pod \""${runnerset_pod_name}"\" to become ready..."
kubectl wait pod/${runnerset_pod_name} --for condition=ready --timeout 270s
break 2
fi
sleep 1
let "count=count+1"
done
fi
sleep 1
let "count=count+1"
done
if [ $count -ge 30 ]; then
repo_runnerset_passed=false
else
repo_runnerset_passed=true
fi
fi

echo Found runner ${runner_name}.

# Wait a bit to make sure the runner pod is created before looking for it.
sleep 2

pod_name=

while [ -z "${pod_name}" ]; do
echo Finding the runner pod... 1>&2
sleep 1
pod_name=$(kubectl get pod --output=jsonpath="{.items[*].metadata.name}" | grep ${runner_name})
done

echo Found pod ${pod_name}.

echo Waiting for pod ${runner_name} to become ready... 1>&2

kubectl wait pod/${runner_name} --for condition=ready --timeout 270s

echo All tests passed. 1>&2
if [ ${repo_runnerset_passed} == true ] || [ ${repo_runnerset_passed} == "skipped" ] && \
[ ${repo_runnerdeployment_passed} == true ] || [ ${repo_runnerdeployment_passed} == "skipped" ]; then
echo "INFO : All tests passed or skipped"
echo "RunnerSet Repo Test Status : ${repo_runnerset_passed}"
echo "RunnerDeployment Repo Test Status : ${repo_runnerdeployment_passed}"
else
echo "ERROR : Some tests failed"
echo "RunnerSet Repo Test Status : ${repo_runnerset_passed}"
echo "RunnerDeployment Repo Test Status : ${repo_runnerdeployment_passed}"
exit 1
fi
10 changes: 8 additions & 2 deletions acceptance/deploy.sh
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ if [ "${tool}" == "helm" ]; then
--set image.repository=${NAME} \
--set image.tag=${VERSION} \
-f ${VALUES_FILE}
kubectl apply -f charts/actions-runner-controller/crds
kubectl -n actions-runner-system wait deploy/actions-runner-controller --for condition=available --timeout 60s
else
kubectl apply \
Expand All @@ -47,8 +48,13 @@ fi
sleep 20

if [ -n "${TEST_REPO}" ]; then
cat acceptance/testdata/runnerdeploy.yaml | envsubst | kubectl apply -f -
cat acceptance/testdata/hra.yaml | envsubst | kubectl apply -f -
if [ -n "USE_RUNNERSET" ]; then
cat acceptance/testdata/repo.runnerset.yaml | envsubst | kubectl apply -f -
else
echo 'Deploying runnerdeployment and hra. Set USE_RUNNERSET if you want to deploy runnerset instead.'
cat acceptance/testdata/repo.runnerdeploy.yaml | envsubst | kubectl apply -f -
cat acceptance/testdata/repo.hra.yaml | envsubst | kubectl apply -f -
fi
else
echo 'Skipped deploying runnerdeployment and hra. Set TEST_REPO to "yourorg/yourrepo" to deploy.'
fi
Expand Down
1 change: 1 addition & 0 deletions acceptance/testdata/org.hra.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ spec:
minReplicas: 0
minReplicas: 0
maxReplicas: 5
# Used to test that HRA is working for org runners
metrics:
- type: PercentageRunnersBusy
scaleUpThreshold: '0.75'
Expand Down
File renamed without changes.
File renamed without changes.
51 changes: 51 additions & 0 deletions acceptance/testdata/repo.runnerset.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerSet
metadata:
name: example-runnerset
spec:
# MANDATORY because it is based on StatefulSet: Results in a below error when omitted:
# missing required field "selector" in dev.summerwind.actions.v1alpha1.RunnerSet.spec
selector:
matchLabels:
app: example-runnerset

# MANDATORY because it is based on StatefulSet: Results in a below error when omitted:
# missing required field "serviceName" in dev.summerwind.actions.v1alpha1.RunnerSet.spec]
serviceName: example-runnerset

#replicas: 1
ephemeral: false

repository: ${TEST_REPO}
#
# Custom runner image
#
image: ${RUNNER_NAME}:${RUNNER_TAG}
#
# dockerd within runner container
#
## Replace `mumoshu/actions-runner-dind:dev` with your dind image
#dockerdWithinRunnerContainer: true
#
# Set the MTU used by dockerd-managed network interfaces (including docker-build-ubuntu)
#
#dockerMTU: 1450
#Runner group
# labels:
# - "mylabel 1"
# - "mylabel 2"

#
# Non-standard working directory
#
# workDir: "/"
template:
metadata:
labels:
app: example-runnerset
spec:
containers:
- name: runner
imagePullPolicy: IfNotPresent
#- name: docker
# #image: mumoshu/actions-runner-dind:dev
Loading

0 comments on commit 9e4dbf4

Please sign in to comment.