
Add missing scale subresource status in order to use an HPA resource over an OpenTelemetryCollector CR #775

Closed
secat opened this issue Mar 16, 2022 · 8 comments · Fixed by #785
Labels
area:collector Issues for deploying collector

Comments

@secat
Contributor

secat commented Mar 16, 2022

The OpenTelemetryCollector has a configuration field named replicas (see openTelemetryCollector.spec.replicas). However, it lacks the status information needed to configure an HPA on it (see the scale subresource documentation) via a scale subresource defined in the OpenTelemetryCollector CRD.

Suggestion

I suggest updating the OpenTelemetryCollector CRD status with:

// ScaleSubresourceStatus defines the observed state of the OpenTelemetryCollector's
// scale subresource.
type ScaleSubresourceStatus struct {
	// The total number of non-terminated pods targeted by this
	// OpenTelemetryCollector's deployment or statefulSet.
	// +optional
	Replicas int32 `json:"replicas,omitempty"`

	// The selector used to match the OpenTelemetryCollector's
	// deployment or statefulSet pods.
	// +optional
	Selector string `json:"selector,omitempty"`
}

// OpenTelemetryCollectorStatus defines the observed state of OpenTelemetryCollector.
type OpenTelemetryCollectorStatus struct {
	[...]

	// Scale is the OpenTelemetryCollector's scale subresource status.
	// +optional
	Scale ScaleSubresourceStatus `json:"scale,omitempty"`

	[...]
}

// +kubebuilder:object:root=true
// +kubebuilder:resource:shortName=otelcol;otelcols
// +kubebuilder:subresource:status 
// +kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.scale.replicas,selectorpath=.status.scale.selector
[...]
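
For illustration, here is a minimal sketch of how the reconciler could fill in this status from the Deployment it manages. The updateScaleStatus helper, package name, and import paths are assumptions for the sketch, not the operator's actual code:

package controllers

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	// Hypothetical import path for the CR types proposed above.
	"github.com/open-telemetry/opentelemetry-operator/apis/v1alpha1"
)

// updateScaleStatus is a hypothetical helper that mirrors the managed
// Deployment's replica count and pod selector into the CR status, where
// the /scale subresource would read them.
func updateScaleStatus(otelcol *v1alpha1.OpenTelemetryCollector, deploy *appsv1.Deployment) error {
	// Total non-terminated pods targeted by the Deployment.
	otelcol.Status.Scale.Replicas = deploy.Status.Replicas

	// Serialized label selector the HPA uses to find the pods; without
	// it the HPA reports InvalidSelector (see the status example below).
	selector, err := metav1.LabelSelectorAsSelector(deploy.Spec.Selector)
	if err != nil {
		return err
	}
	otelcol.Status.Scale.Selector = selector.String()
	return nil
}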

NOTE: The work done in PR #746 doesn't fulfill our needs: we need to scale based on memory consumption, and we may in the future configure scaling based on a custom metric. We also want to configure our own desired scaling behavior.

@secat
Contributor Author

secat commented Mar 16, 2022

@jpkrohling I would be available to contribute a PR for this issue. Thank you in advance.

@pavolloffay pavolloffay added the area:collector Issues for deploying collector label Mar 16, 2022
@secat secat changed the title Missing scale subresource status in order to use an HPA resource over an OpenTelemetryCollector CR Add missing scale subresource status in order to use an HPA resource over an OpenTelemetryCollector CR Mar 16, 2022
@jpkrohling
Member

It's yours!

@pavolloffay
Member

pavolloffay commented Mar 21, 2022

@secat could you please explain your use case, and how this feature differs from #746, which added a MaxReplicas field to the CR and configures an HPA for the collector deployment when that field is used?

Could you please also explain how the /scale subresource is used (e.g. by the HPA) in your use case? Do you create additional k8s objects to make use of it?

@secat
Contributor Author

secat commented Mar 21, 2022

@pavolloffay as described in the note in the main description of this issue, the current implementation from #746 doesn't fulfill our needs, since it creates a v1 HPA based on CPU usage only, without any other configuration knobs. It also doesn't provide any of the advanced HPA v2beta1 configuration options.

We want to scale based on memory in the short/medium term. Long term, we want to scale on a custom metric. We also want to configure our own scaling behavior. None of this is possible with the current implementation and its single configuration knob, MaxReplicas.

We have a meta controller that creates the OpenTelemetryCollector along with a v2beta2 HPA resource configured on top of it. This HPA will control the OpenTelemetryCollector resource's spec.replicas configuration field and use status.scale.replicas and status.scale.selector (see the scale subresource documentation).

Also, in the short/medium term we want to use a Deployment for the otel collector, but we may in the future switch to a StatefulSet if we start using the WAL. The external HPA will control the replicas field, which in turn controls the Deployment or StatefulSet.

@pavolloffay
Member

thanks for the explanation @secat. If you could share your HPA configuration (v2beta2) here as well, that would be helpful (also for other users) to provide a complete story/guide.

@secat
Contributor Author

secat commented Mar 22, 2022

@pavolloffay here is an example of my current HPA v2beta2 configuration:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: "2022-03-22T11:11:56Z"
  labels:
    app.kubernetes.io/component: collector
    app.kubernetes.io/instance: tracingcollectorendpoint-collector-857851ca
    app.kubernetes.io/managed-by: tracing-operator
    app.kubernetes.io/name: tce-scatudal-local-lab
    app.kubernetes.io/part-of: tracingcollectorendpoint
  name: tracingcollectorendpoint-collector-857851ca
  namespace: scatudal-local-lab
  ownerReferences:
  - apiVersion: tracing.observability.harbour.ubisoft.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: TracingCollectorEndpoint
    name: tce-scatudal-local-lab
    uid: c94e65b9-fb7c-420e-94c2-50f7dfd5e0e3
  resourceVersion: "266206"
  uid: dcc685bd-56e2-4a22-a0e4-cfb8a0bc9c2a
spec:
  maxReplicas: 10
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 70
        type: Utilization
    type: Resource
  - resource:
      name: cpu
      target:
        averageUtilization: 70
        type: Utilization
    type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: opentelemetry.io/v1alpha1
    kind: OpenTelemetryCollector
    name: tracingcollectorendpoint-collector-857851ca
status:
  conditions:
  - lastTransitionTime: "2022-03-22T11:12:11Z"
    message: the HPA controller was able to get the target's current scale
    reason: SucceededGetScale
    status: "True"
    type: AbleToScale
  - lastTransitionTime: "2022-03-22T11:12:11Z"
    message: the HPA target's scale is missing a selector
    reason: InvalidSelector
    status: "False"
    type: ScalingActive
  currentMetrics: null
  currentReplicas: 1
  desiredReplicas: 0

NOTE: Trick to get the HPA specifically as v2beta2 using kubectl:

kubectl get hpa.v2beta2.autoscaling my-hpa
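
For debugging, once the CRD exposes the scale subresource it can also be read directly through the raw API; a sketch, where the namespace and collector name are placeholders:

kubectl get --raw "/apis/opentelemetry.io/v1alpha1/namespaces/<namespace>/opentelemetrycollectors/<name>/scale"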

@pavolloffay
Member

Notes about autoscaling API deprecation:

The autoscaling/v2beta2 API version of HorizontalPodAutoscaler will no longer be served in v1.26.

Migrate manifests and API clients to use the autoscaling/v2 API version, available since v1.23.
All existing persisted objects are accessible via the new API

------

The autoscaling/v2beta1 API version of HorizontalPodAutoscaler will no longer be served in v1.25.

Migrate manifests and API clients to use the autoscaling/v2 API version, available since v1.23.
All existing persisted objects are accessible via the new API
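
For reference, the v2beta2 example above should migrate to the GA API with, in this case, only the apiVersion changing; an untested sketch of the same spec under autoscaling/v2:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tracingcollectorendpoint-collector-857851ca
  namespace: scatudal-local-lab
spec:
  scaleTargetRef:
    apiVersion: opentelemetry.io/v1alpha1
    kind: OpenTelemetryCollector
    name: tracingcollectorendpoint-collector-857851ca
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70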

@secat
Copy link
Contributor Author

secat commented Mar 25, 2022

@pavolloffay thank you 🙏 for the heads up!

At least the scale subresource works with any HPA version.
