Automatically set num shards and replicas from referenced OCP ES #1737

Merged Feb 25, 2022 (12 commits)

@pavolloffay (Member) commented Feb 7, 2022

Signed-off-by: Pavol Loffay [email protected]

Resolves #1720

Notable changes:

  • defaulting webhook - set ES number of nodes from the referenced ES instance
  • validating webhook - validate that the referenced ES instance exists
  • ES controller - watch for changes and update ES nodes in Jaeger CR accordingly.
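
As a rough illustration of the first two items, here is a minimal, stdlib-only sketch. It is not the code from this PR: `ESLookup`, `ElasticsearchRef`, `Default`, and `Validate` are hypothetical names, and the lookup function stands in for a Kubernetes client read of the referenced Elasticsearch CR.

```go
package main

import "fmt"

// ESLookup is a hypothetical stand-in for reading the referenced Elasticsearch
// CR through a Kubernetes client. It returns the node count of the instance,
// or false when no instance with that name exists in the namespace.
type ESLookup func(namespace, name string) (nodeCount int32, found bool)

// ElasticsearchRef is a simplified, hypothetical view of
// spec.storage.elasticsearch in the Jaeger CR.
type ElasticsearchRef struct {
	Name           string
	DoNotProvision bool
	NodeCount      int32
}

// Default sketches the defaulting step: when the Jaeger CR only references an
// existing ES instance, copy the node count from that instance.
func Default(ref *ElasticsearchRef, namespace string, lookup ESLookup) {
	if !ref.DoNotProvision {
		return // ES is provisioned for this Jaeger instance; nothing to copy
	}
	if n, ok := lookup(namespace, ref.Name); ok {
		ref.NodeCount = n
	}
}

// Validate sketches the validating step: a referenced ES instance must exist.
func Validate(ref ElasticsearchRef, namespace string, lookup ESLookup) error {
	if !ref.DoNotProvision {
		return nil
	}
	if _, ok := lookup(namespace, ref.Name); !ok {
		return fmt.Errorf("elasticsearch instance %q not found in namespace %q", ref.Name, namespace)
	}
	return nil
}

func main() {
	// Fake cluster state with made-up values: one ES instance with three nodes.
	lookup := func(namespace, name string) (int32, bool) {
		if namespace == "tracing" && name == "shared-es" {
			return 3, true
		}
		return 0, false
	}
	ref := ElasticsearchRef{Name: "shared-es", DoNotProvision: true, NodeCount: 1}
	Default(&ref, "tracing", lookup)
	fmt.Println(ref.NodeCount, Validate(ref, "tracing", lookup))
}
```

Passing the lookup as a parameter keeps the sketch free of cluster access; in the operator the same information would come from a client read.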

Test data:

kubectl apply -f - <<EOF
apiVersion: logging.openshift.io/v1
kind: Elasticsearch
metadata:
  annotations:
    logging.openshift.io/elasticsearch-cert-management: "true"
    logging.openshift.io/elasticsearch-cert.jaeger-shared-es: user.jaeger
  name: shared-es
spec:
  managementState: Managed
  nodeSpec:
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 200m
        memory: 1Gi
  nodes:
    - nodeCount: 1
      proxyResources: {}
      resources: {}
      roles:
        - master
        - client
        - data
      storage: {}
  redundancyPolicy: ZeroRedundancy
EOF

kubectl apply -f - <<EOF
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: tenant1-jaeger
spec:
  strategy: production
  query:
    strategy:
      type: RollingUpdate
  collector:
    strategy:
      type: RollingUpdate
  ingress:
    openshift:
      sar: "" # allow login for all OCP users for testing
  storage:
    type: elasticsearch
    options:
      es:
        index-prefix: tenant1
    elasticsearch:
      name: shared-es
      doNotProvision: true
      nodeCount: 1
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          memory: 1Gi
EOF

var jaegerlog = logf.Log.WithName("jaeger-resource")
var (
jaegerlog = logf.Log.WithName("jaeger-resource")
cl client.Client
Member Author

I don't like introducing a global state, but another option would be to implement the webhook "manually", which I like even less for validating and defaulting webhooks.

Member

Can we register another object to handle the webhook instead of Jaeger and embed Jaeger?

Just an idea (untested)
type JaegerWebhook struct {
	Jaeger

	cl client.Client
}

type Jaeger struct {
	...
}

Member Author

That unfortunately does not work. We could add the client to the Jaeger type, but that is even worse.

Collaborator

I think this is the "less worse" option we have. I don't like it either, but I don't see any other possibility so far.

Member

While I was working on something else I became aware of the custom validator[0].

What do you think about using it? The method calls differ only slightly. In SetupWebhookWithManager, the local client could be passed to the validator.

func (r *something) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(r).
		WithValidator(newSomethingValidator(mgr.GetClient())).
		Complete()
}

[0]: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/webhook/admission/validator_custom.go#L30-L35

}

// NewElasticsearchReconciler creates a new deployment reconciler controller
func NewElasticsearchReconciler(client client.Client, clientReader client.Reader) *ElasticsearchReconciler {
Member Author

To be honest, I am not sure whether we should use a controller or a webhook to catch changes on the ES CR. I didn't find any good docs on when to use a webhook over a controller.

Member Author

@rubenvp8510 any thoughts?

Collaborator @rubenvp8510, Feb 10, 2022

I think it is OK to use a controller in this case. I haven't found a reference for when to use webhooks or controllers, but I tend to think that webhooks are for validating or mutating the resource, which is not the case here. A controller also allows us to re-enqueue the reconciliation in case of failure.

Member Author

I think the controller is fine here; webhooks are better suited to validation and mutation.

I had to make one more important change. The controller can be installed only if the Elasticsearch kind is installed in the cluster (by the OpenShift Elasticsearch operator).
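
A minimal sketch of that condition, assuming the set of kinds served by the cluster has already been discovered. `apiKind`, `hasKind`, and `controllersToRegister` are hypothetical names, and the controller names are plain strings chosen for the example.

```go
package main

import "fmt"

// apiKind identifies a resource kind roughly the way API discovery reports it.
type apiKind struct {
	GroupVersion string
	Kind         string
}

// hasKind reports whether the cluster serves the given kind.
func hasKind(available []apiKind, groupVersion, kind string) bool {
	for _, k := range available {
		if k.GroupVersion == groupVersion && k.Kind == kind {
			return true
		}
	}
	return false
}

// controllersToRegister always includes the Jaeger controller and adds the
// Elasticsearch controller only when its kind is present, since watching a
// kind the API server does not serve cannot work.
func controllersToRegister(available []apiKind) []string {
	controllers := []string{"jaeger"}
	if hasKind(available, "logging.openshift.io/v1", "Elasticsearch") {
		controllers = append(controllers, "elasticsearch")
	}
	return controllers
}

func main() {
	withES := []apiKind{{GroupVersion: "logging.openshift.io/v1", Kind: "Elasticsearch"}}
	fmt.Println(controllersToRegister(withES))
	fmt.Println(controllersToRegister(nil))
}
```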

Member Author

It seems this is already covered in the docs:

https://docs.openshift.com/container-platform/4.9/distr_tracing/distr_tracing_install/distr-tracing-installing.html#distr-tracing-jaeger-operator-install_install-distributed-tracing

If you require persistent storage, you must also install the OpenShift Elasticsearch Operator before installing the Red Hat OpenShift distributed tracing platform Operator.

cc @JStickler

codecov bot commented Feb 9, 2022

Codecov Report

Merging #1737 (1d33758) into main (9a4630a) will increase coverage by 0.13%.
The diff coverage is 87.01%.

@@            Coverage Diff             @@
##             main    #1737      +/-   ##
==========================================
+ Coverage   87.53%   87.67%   +0.13%     
==========================================
  Files          99      100       +1     
  Lines        5946     6003      +57     
==========================================
+ Hits         5205     5263      +58     
+ Misses        567      563       -4     
- Partials      174      177       +3     
Impacted Files                                          Coverage Δ
apis/v1/jaeger_types.go                                 87.50% <ø> (ø)
pkg/storage/elasticsearch.go                            80.28% <ø> (-0.68%) ⬇️
pkg/upgrade/v1_31_0.go                                  18.18% <0.00%> (ø)
...ntroller/elasticsearch/elasticsearch_controller.go   83.33% <83.33%> (ø)
apis/v1/jaeger_webhook.go                               77.77% <89.28%> (+77.77%) ⬆️
pkg/autodetect/main.go                                  87.19% <100.00%> (-0.16%) ⬇️
pkg/metrics/instances.go                                89.36% <100.00%> (ø)
pkg/strategy/controller.go                              94.50% <100.00%> (ø)
pkg/strategy/production.go                              100.00% <100.00%> (ø)
pkg/strategy/streaming.go                               98.55% <100.00%> (ø)
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@pavolloffay changed the title from "Automatically set num shards and replicas from references OCP ES" to "Automatically set num shards and replicas from referenced OCP ES" on Feb 10, 2022
}

// ShouldInjectOpenShiftElasticsearchConfiguration returns true if OpenShift Elasticsearch is used and its configuration should be used.
func ShouldInjectOpenShiftElasticsearchConfiguration(s JaegerStorageSpec) bool {
Collaborator

What is the difference between this function and the existing one (https://github.com/jaegertracing/jaeger-operator/blob/master/pkg/storage/elasticsearch.go#L26)? Can we use the one we already have?

Member Author

The referenced function was moved here, into the v1 package.

@pavolloffay pavolloffay reopened this Feb 14, 2022
@pavolloffay pavolloffay force-pushed the es-controller branch 2 times, most recently from 4ab1501 to deb93ad Compare February 21, 2022 12:11
Signed-off-by: Pavol Loffay <[email protected]>
@pavolloffay (Member Author)

@rubenvp8510 could you please review?

@rubenvp8510 (Collaborator) left a comment

LGTM

Successfully merging this pull request may close these issues.

Watch ES CR for changes and corretly set number of shards and replicas