
Support specifying nodeSelector and affinity values in SmbCommonConfig to control pod scheduling #291

Merged · mergify bot merged 14 commits into samba-in-kubernetes:master on Apr 3, 2023

Conversation

phlogistonjohn
Collaborator

Fixes: #283

The new podSettings key allows for control of certain parameters the operator cannot or should not guess at. Currently, this allows one to control scheduling via nodeSelector and affinity sections under podSettings in SmbCommonConfig.

It's located in SmbCommonConfig because SmbShare is really supposed to be about the share and less about the server; it's similar to how network integration is configured through SmbCommonConfig. It's certainly not a fit for SmbSecurityConfig :-)
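For illustration, a minimal sketch of what such a resource could look like is shown below. The apiVersion and the example label key/values are assumptions; the exact field layout under podSettings should be taken from docs/resources/SmbCommonConfig.md.

```yaml
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbCommonConfig
metadata:
  name: my-common-config
spec:
  podSettings:
    # Standard Kubernetes nodeSelector: server pods only schedule on
    # nodes carrying this (placeholder) label.
    nodeSelector:
      example.com/smb-capable: "true"
    # Standard Kubernetes affinity block, passed through to the pods.
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: example.com/smb-capable
                  operator: In
                  values: ["true"]
```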

I also added some extra labels to the smb pods so that they can be quickly backtracked to the SmbCommonConfig and SmbSecurityConfig that were used to generate them. These labels helped me write the test cases.
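As a sketch of what that backtracking looks like, a server pod generated from resources like the example above might carry labels along these lines (the label keys are taken from the commit messages further down; the values here are placeholder resource names):

```yaml
metadata:
  labels:
    # Points back to the SmbCommonConfig the pod was generated from.
    samba-operator.samba.org/common-config-from: my-common-config
    # Points back to the SmbSecurityConfig, when one is used.
    samba-operator.samba.org/security-config-from: my-security-config
```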

@phlogistonjohn
Collaborator Author

Right. These tests probably don't make any sense on a single-node setup. I need to think about how to handle that case.

@phlogistonjohn
Collaborator Author

Erm.

            --- PASS: TestIntegration/groupedShares/clustered/TestPodsReady (0.02s)
    --- PASS: TestIntegration/scheduling (51.03s)
        --- PASS: TestIntegration/scheduling/NodeSelectorSuite (24.09s)
            --- PASS: TestIntegration/scheduling/NodeSelectorSuite/TestPodsRunOnLabeledNode (23.93s)
        --- PASS: TestIntegration/scheduling/AffinityBasedSelectorSuite (26.94s)
            --- PASS: TestIntegration/scheduling/AffinityBasedSelectorSuite/TestPodsRunOnLabeledNode (26.75s)
PASS
ok  	github.com/samba-in-kubernetes/samba-operator/tests/integration	1607.647s
yq not found in PATH, checking /root/samba-operator/.bin
controller-gen not found in PATH, checking /root/samba-operator/.bin
/root/samba-operator/.bin/controller-gen "crd:trivialVersions=true,crdVersions=v1" rbac:roleName=manager-role webhook \
	paths="./..." output:crd:artifacts:config=config/crd/bases
YQ=/root/samba-operator/.bin/yq /root/samba-operator/hack/yq-fixup-yamls.sh  /root/samba-operator/config
kustomize not found in PATH, checking /root/samba-operator/.bin
/root/samba-operator/.bin/kustomize build config/default | kubectl delete -f -
namespace "samba-operator-system" deleted
customresourcedefinition.apiextensions.k8s.io "smbcommonconfigs.samba-operator.samba.org" deleted
customresourcedefinition.apiextensions.k8s.io "smbsecurityconfigs.samba-operator.samba.org" deleted
customresourcedefinition.apiextensions.k8s.io "smbshares.samba-operator.samba.org" deleted
role.rbac.authorization.k8s.io "samba-operator-leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "samba-operator-manager-role" deleted
clusterrole.rbac.authorization.k8s.io "samba-operator-metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "samba-operator-proxy-role" deleted
rolebinding.rbac.authorization.k8s.io "samba-operator-leader-election-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "samba-operator-manager-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "samba-operator-proxy-rolebinding" deleted
configmap "samba-operator-controller-cfg" deleted
service "samba-operator-controller-manager-metrics-service" deleted
deployment.apps "samba-operator-controller-manager" deleted
* Deleting "minikube" in kvm2 ...
* Deleting "minikube-m02" in kvm2 ...
* Deleting "minikube-m03" in kvm2 ...
* Removed all traces of the "minikube" cluster.
time="2023-03-02T22:35:46Z" level=fatal msg="Unable to delete registry-samba.apps.ocp.cloud.ci.centos.org/sink/samba-operator:ci-k8s-1.26-pr291. Image may not exist or is not stored with a v2 Schema in a v2 registry"

Our tests passed, but the minikube delete command must have failed?

I'm marking this as ready for review and kicking off another CI run, but if this keeps happening I'll ask reviewers to focus on the status of the actual Go tests rather than the overall CI state.

@phlogistonjohn phlogistonjohn marked this pull request as ready for review March 2, 2023 22:48
@phlogistonjohn
Collaborator Author

/test centos-ci/sink-clustered/mini-k8s-1.26

@anoopcs9
Collaborator

anoopcs9 commented Mar 6, 2023

Erm.

* Deleting "minikube" in kvm2 ...
* Deleting "minikube-m02" in kvm2 ...
* Deleting "minikube-m03" in kvm2 ...
* Removed all traces of the "minikube" cluster.
time="2023-03-02T22:35:46Z" level=fatal msg="Unable to delete registry-samba.apps.ocp.cloud.ci.centos.org/sink/samba-operator:ci-k8s-1.26-pr291. Image may not exist or is not stored with a v2 Schema in a v2 registry"

Our tests passed, but the minikube delete command must have failed?

I think it's the following skopeo command from the job script that failed in the above scenario:

skopeo delete "docker://${CI_IMG_OP}"

I can see that the job was triggered twice within a span of 20 minutes. Image tags are differentiated only by PR number, so the first run completed successfully and the second run then failed to find the image for deletion afterwards.

@phlogistonjohn
Collaborator Author

@Mergifyio rebase

@mergify

mergify bot commented Mar 27, 2023

rebase

✅ Branch has been successfully rebased

anoopcs9 (Collaborator) left a comment

See below for a typo, remark and clarification.

docs/resources/SmbCommonConfig.md
internal/resources/statefulsets.go
tests/integration/scheduling_test.go
anoopcs9 previously approved these changes Mar 31, 2023

anoopcs9 (Collaborator) left a comment

lgtm, cool work.

@phlogistonjohn added the priority-review (This PR deserves a look) label on Mar 31, 2023
@phlogistonjohn
Collaborator Author

@Mergifyio rebase

Add labels for "samba-operator.samba.org/common-config-from" and
"samba-operator.samba.org/security-config-from" when the server instance
is derived from a share using common config or security config CRs. The
values of the labels map back to the names of the resources.
This can be handy for quickly seeing which resources came from where, and for
writing tests that need to associate a pod with a CR beyond just the
SmbShare name.

Signed-off-by: John Mulligan <[email protected]>
A node selector value found in the common config takes priority over
that of the operator config or hard-coded default.

Signed-off-by: John Mulligan <[email protected]>
This will allow users & admins to customize pod affinity settings
for the local cluster's needs.

Signed-off-by: John Mulligan <[email protected]>
This patch includes the logic to add the anti-affinity rule that you
typically want for ctdb clustered smbds at the end of the rules
pulled from the common config.

Signed-off-by: John Mulligan <[email protected]>
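For reference, the anti-affinity rule typically wanted for ctdb-clustered smbds looks roughly like the following sketch; the label key/value in the selector are placeholders, not necessarily the operator's actual internal labels.

```yaml
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    # Keep smbd pods belonging to the same clustered server group on
    # different nodes, so losing one node does not take out the cluster.
    - labelSelector:
        matchLabels:
          samba-operator.samba.org/server-group: my-clustered-share
      topologyKey: kubernetes.io/hostname
```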
If not, every test run will always use the same pattern, possibly
defeating the purpose of using random choice in the first place.

Signed-off-by: John Mulligan <[email protected]>
Add some tests that check that node selector values and/or affinity
rules are passed on from a SmbCommonConfig to the pods that are created.

Signed-off-by: John Mulligan <[email protected]>
The new podSettings key allows for control of certain parameters the
operator cannot guess at. Currently, this allows one to control
scheduling via nodeSelector and affinity sections.

Signed-off-by: John Mulligan <[email protected]>
Add yet another environment variable to inform the test suite about the
outer world. The variable SMBOP_TEST_MIN_NODE_COUNT should be set to the
number of nodes that typical pods may be scheduled on; the tests may
then use this value to determine whether the cluster is in-spec or
out-of-spec for tests that require a certain number of nodes.

Signed-off-by: John Mulligan <[email protected]>
The tests for the node selector and affinity scheduling require at least
two nodes. Previously, the tests would simply fail if there weren't
enough nodes to run them. This patch adds a check such that the test is
skipped if there aren't enough nodes, unless SMBOP_TEST_MIN_NODE_COUNT
was specified. This variable acts as a double check on the expected
environment: if the number of available nodes is less than that given by
the env var, the test is required to fail.

Signed-off-by: John Mulligan <[email protected]>
In the centosci-based testing environment we have an expected number of
worker nodes that we can pass to the test suite. This can then be used by
the suite to determine whether the test cluster has been set up according
to the intended spec.

Signed-off-by: John Mulligan <[email protected]>
@mergify

mergify bot commented Apr 2, 2023

rebase

✅ Branch has been successfully rebased

@phlogistonjohn
Collaborator Author

WHY did it dismiss the existing review? There were no conflicts and the rebase was done via mergify.
Errgh. @anoopcs9, can you please take another look? Thanks!

mergify bot merged commit f9b9853 into samba-in-kubernetes:master on Apr 3, 2023
Labels: priority-review (This PR deserves a look)

Successfully merging this pull request may close these issues:
How to configure node selector if using a mixed K8S (#283)