Fix flaky target allocator test #913

Closed
pavolloffay opened this issue Jun 7, 2022 · 2 comments · Fixed by #928
Labels
area:target-allocator Issues for target-allocator

Comments

@pavolloffay
Member

The TA smoke test (https://github.com/open-telemetry/opentelemetry-operator/tree/main/tests/e2e/smoke-targetallocator) seems to be intermittently failing.

Failed test:

=== CONT  kuttl/harness/smoke-targetallocator
    logger.go:42: 07:49:22 | smoke-targetallocator/2-install | test step failed 2-install
    case.go:361: failed in step 2-install
    case.go:363: --- Deployment:kuttl-test-evolved-starfish/stateful-targetallocator
        +++ Deployment:kuttl-test-evolved-starfish/stateful-targetallocator
        @@ -1,9 +1,207 @@
         apiVersion: apps/v1
         kind: Deployment
         metadata:
        +  labels:
        +    app.kubernetes.io/component: opentelemetry-targetallocator
        +    app.kubernetes.io/instance: kuttl-test-evolved-starfish.stateful
        +    app.kubernetes.io/managed-by: opentelemetry-operator
        +    app.kubernetes.io/name: stateful-targetallocator
        +    app.kubernetes.io/part-of: opentelemetry
        +  managedFields:
        +  - apiVersion: apps/v1
        +    fieldsType: FieldsV1
        +    fieldsV1:
        +      f:metadata:
        +        f:annotations:
        +          .: {}
        +          f:deployment.kubernetes.io/revision: {}
        +      f:status:
        +        f:conditions:
        +          .: {}
        +          k:{"type":"Available"}:
        +            .: {}
        +            f:lastTransitionTime: {}
        +            f:lastUpdateTime: {}
        +            f:message: {}
        +            f:reason: {}
        +            f:status: {}
        +            f:type: {}
        +          k:{"type":"Progressing"}:
        +            .: {}
        +            f:lastTransitionTime: {}
        +            f:lastUpdateTime: {}
        +            f:message: {}
        +            f:reason: {}
        +            f:status: {}
        +            f:type: {}
        +        f:observedGeneration: {}
        +        f:replicas: {}
        +        f:unavailableReplicas: {}
        +        f:updatedReplicas: {}
        +    manager: kube-controller-manager
        +    operation: Update
        +    subresource: status
        +    time: "2022-06-07T07:46:52Z"
        +  - apiVersion: apps/v1
        +    fieldsType: FieldsV1
        +    fieldsV1:
        +      f:metadata:
        +        f:labels:
        +          .: {}
        +          f:app.kubernetes.io/component: {}
        +          f:app.kubernetes.io/instance: {}
        +          f:app.kubernetes.io/managed-by: {}
        +          f:app.kubernetes.io/name: {}
        +          f:app.kubernetes.io/part-of: {}
        +        f:ownerReferences:
        +          .: {}
        +          k:{"uid":"1da5bb41-8289-43d6-a3e6-333cfd008ac7"}: {}
        +      f:spec:
        +        f:progressDeadlineSeconds: {}
        +        f:replicas: {}
        +        f:revisionHistoryLimit: {}
        +        f:selector: {}
        +        f:strategy:
        +          f:rollingUpdate:
        +            .: {}
        +            f:maxSurge: {}
        +            f:maxUnavailable: {}
        +          f:type: {}
        +        f:template:
        +          f:metadata:
        +            f:labels:
        +              .: {}
        +              f:app.kubernetes.io/component: {}
        +              f:app.kubernetes.io/instance: {}
        +              f:app.kubernetes.io/managed-by: {}
        +              f:app.kubernetes.io/name: {}
        +              f:app.kubernetes.io/part-of: {}
        +          f:spec:
        +            f:containers:
        +              k:{"name":"ta-container"}:
        +                .: {}
        +                f:env:
        +                  .: {}
        +                  k:{"name":"OTELCOL_NAMESPACE"}:
        +                    .: {}
        +                    f:name: {}
        +                    f:valueFrom:
        +                      .: {}
        +                      f:fieldRef: {}
        +                f:image: {}
        +                f:imagePullPolicy: {}
        +                f:name: {}
        +                f:resources: {}
        +                f:terminationMessagePath: {}
        +                f:terminationMessagePolicy: {}
        +                f:volumeMounts:
        +                  .: {}
        +                  k:{"mountPath":"/conf"}:
        +                    .: {}
        +                    f:mountPath: {}
        +                    f:name: {}
        +            f:dnsPolicy: {}
        +            f:restartPolicy: {}
        +            f:schedulerName: {}
        +            f:securityContext: {}
        +            f:serviceAccount: {}
        +            f:serviceAccountName: {}
        +            f:terminationGracePeriodSeconds: {}
        +            f:volumes:
        +              .: {}
        +              k:{"name":"ta-internal"}:
        +                .: {}
        +                f:configMap:
        +                  .: {}
        +                  f:defaultMode: {}
        +                  f:items: {}
        +                  f:name: {}
        +                f:name: {}
        +    manager: manager
        +    operation: Update
        +    time: "2022-06-07T07:46:52Z"
           name: stateful-targetallocator
           namespace: kuttl-test-evolved-starfish
        +  ownerReferences:
        +  - apiVersion: opentelemetry.io/v1alpha1
        +    blockOwnerDeletion: true
        +    controller: true
        +    kind: OpenTelemetryCollector
        +    name: stateful
        +    uid: 1da5bb41-8289-43d6-a3e6-333cfd008ac7
        +spec:
        +  progressDeadlineSeconds: 600
        +  replicas: 1
        +  revisionHistoryLimit: 10
        +  selector:
        +    matchLabels:
        +      app.kubernetes.io/component: opentelemetry-targetallocator
        +      app.kubernetes.io/instance: kuttl-test-evolved-starfish.stateful
        +      app.kubernetes.io/managed-by: opentelemetry-operator
        +      app.kubernetes.io/name: stateful-targetallocator
        +      app.kubernetes.io/part-of: opentelemetry
        +  strategy:
        +    rollingUpdate:
        +      maxSurge: 25%
        +      maxUnavailable: 25%
        +    type: RollingUpdate
        +  template:
        +    metadata:
        +      creationTimestamp: null
        +      labels:
        +        app.kubernetes.io/component: opentelemetry-targetallocator
        +        app.kubernetes.io/instance: kuttl-test-evolved-starfish.stateful
        +        app.kubernetes.io/managed-by: opentelemetry-operator
        +        app.kubernetes.io/name: stateful-targetallocator
        +        app.kubernetes.io/part-of: opentelemetry
        +    spec:
        +      containers:
        +      - env:
        +        - name: OTELCOL_NAMESPACE
        +          valueFrom:
        +            fieldRef:
        +              apiVersion: v1
        +              fieldPath: metadata.namespace
        +        image: local/opentelemetry-operator-targetallocator:e2e
        +        imagePullPolicy: IfNotPresent
        +        name: ta-container
        +        resources: {}
        +        terminationMessagePath: /dev/termination-log
        +        terminationMessagePolicy: File
        +        volumeMounts:
        +        - mountPath: /conf
        +          name: ta-internal
        +      dnsPolicy: ClusterFirst
        +      restartPolicy: Always
        +      schedulerName: default-scheduler
        +      securityContext: {}
        +      serviceAccount: stateful-collector
        +      serviceAccountName: stateful-collector
        +      terminationGracePeriodSeconds: 30
        +      volumes:
        +      - configMap:
        +          defaultMode: 420
        +          items:
        +          - key: targetallocator.yaml
        +            path: targetallocator.yaml
        +          name: stateful-targetallocator
        +        name: ta-internal
         status:
        -  readyReplicas: 1
        +  conditions:
        +  - lastTransitionTime: "2022-06-07T07:46:52Z"
        +    lastUpdateTime: "2022-06-07T07:46:52Z"
        +    message: Deployment does not have minimum availability.
        +    reason: MinimumReplicasUnavailable
        +    status: "False"
        +    type: Available
        +  - lastTransitionTime: "2022-06-07T07:46:52Z"
        +    lastUpdateTime: "2022-06-07T07:46:52Z"
        +    message: ReplicaSet "stateful-targetallocator-66fdc796fc" is progressing.
        +    reason: ReplicaSetUpdated
        +    status: "True"
        +    type: Progressing
        +  observedGeneration: 1
           replicas: 1
        +  unavailableReplicas: 1
        +  updatedReplicas: 1
    case.go:363: resource Deployment:kuttl-test-evolved-starfish/stateful-targetallocator: .status.readyReplicas: key is missing from map
    logger.go:42: 07:49:22 | smoke-targetallocator | smoke-targetallocator events from ns kuttl-test-evolved-starfish:
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	PersistentVolumeClaim default-volume-stateful-collector-0		WaitForFirstConsumer	waiting for first consumer to be created before binding		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	PersistentVolumeClaim default-volume-stateful-collector-0		ExternalProvisioning	waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	StatefulSet.apps stateful-collector		SuccessfulCreate	create Claim default-volume-stateful-collector-0 Pod stateful-collector-0 in StatefulSet stateful-collector success		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	StatefulSet.apps stateful-collector		SuccessfulCreate	create Pod stateful-collector-0 in StatefulSet stateful-collector successful		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	Pod stateful-targetallocator-66fdc796fc-2qg7h		Scheduled	Successfully assigned kuttl-test-evolved-starfish/stateful-targetallocator-66fdc796fc-2qg7h to kind-control-plane		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	ReplicaSet.apps stateful-targetallocator-66fdc796fc		SuccessfulCreate	Created pod: stateful-targetallocator-66fdc796fc-2qg7h		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:52 +0000 UTC	Normal	Deployment.apps stateful-targetallocator		ScalingReplicaSet	Scaled up replica set stateful-targetallocator-66fdc796fc to 1		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:53 +0000 UTC	Warning	Pod stateful-targetallocator-66fdc796fc-2qg7h		FailedMount	MountVolume.SetUp failed for volume "ta-internal" : failed to sync configmap cache: timed out waiting for the condition		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:54 +0000 UTC	Normal	Pod stateful-targetallocator-66fdc796fc-2qg7h.spec.containers{ta-container}		Pulled	Container image "local/opentelemetry-operator-targetallocator:e2e" already present on machine		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:54 +0000 UTC	Normal	Pod stateful-targetallocator-66fdc796fc-2qg7h.spec.containers{ta-container}		Created	Created container ta-container		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:55 +0000 UTC	Normal	Pod stateful-targetallocator-66fdc796fc-2qg7h.spec.containers{ta-container}		Started	Started container ta-container		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:56 +0000 UTC	Warning	Pod stateful-targetallocator-66fdc796fc-2qg7h.spec.containers{ta-container}		BackOff	Back-off restarting failed container		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:46:59 +0000 UTC	Normal	PersistentVolumeClaim default-volume-stateful-collector-0		Provisioning	External provisioner is provisioning volume for claim "kuttl-test-evolved-starfish/default-volume-stateful-collector-0"		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:47:14 +0000 UTC	Normal	PersistentVolumeClaim default-volume-stateful-collector-0		ProvisioningSucceeded	Successfully provisioned volume pvc-7c426ab2-8bea-4c74-8327-10df641ccb87		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:47:14 +0000 UTC	Normal	Pod stateful-collector-0		Scheduled	Successfully assigned kuttl-test-evolved-starfish/stateful-collector-0 to kind-control-plane		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:47:16 +0000 UTC	Normal	Pod stateful-collector-0.spec.containers{otc-container}		Pulled	Container image "ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:0.51.0" already present on machine		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:47:16 +0000 UTC	Normal	Pod stateful-collector-0.spec.containers{otc-container}		Created	Created container otc-container		
    logger.go:42: 07:49:22 | smoke-targetallocator | 2022-06-07 07:47:16 +0000 UTC	Normal	Pod stateful-collector-0.spec.containers{otc-container}		Started	Started container otc-container	
@pavolloffay added the area:target-allocator label on Jun 7, 2022
@pavolloffay
Member Author

Locally, I am getting this output from the TA container:

k logs stateful-targetallocator-5954c9bcc-jxdbv -n kuttl-test-knowing-camel
{"level":"info","ts":1654681359.3868444,"msg":"Starting the Target Allocator"}
{"level":"error","ts":1654681359.3868706,"logger":"setup","msg":"Can't start the watcher","error":"too many open files","stacktrace":"github.com/open-telemetry/opentelemetry-operator/cmd/otel-allocator/watcher.newConfigMapWatcher\n\t/app/watcher/file.go:19\ngithub.com/open-telemetry/opentelemetry-operator/cmd/otel-allocator/watcher.NewWatcher\n\t/app/watcher/main.go:43\nmain.main\n\t/app/main.go:40\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
{"level":"error","ts":1654681359.386908,"logger":"setup","msg":"Can't start the watchers","error":"too many open files","stacktrace":"main.main\n\t/app/main.go:42\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
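
The `too many open files` error above is raised while the ConfigMap file watcher is being created. If that watcher is inotify-based (an assumption; the log does not name the library), the error may come from the kernel's per-user inotify limits and not only from the process's file-descriptor limit. The following is a minimal Go sketch for inspecting those limits on a Linux machine. The helper names `parseLimit` and `readLimit` are hypothetical, and the `/proc/sys/fs/inotify` paths assume a standard Linux host.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseLimit converts the contents of a /proc/sys file (a decimal number
// followed by a newline) into an int.
func parseLimit(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// readLimit reads and parses a single kernel limit from the given path.
func readLimit(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseLimit(string(data))
}

func main() {
	// Both limits apply per user, so many watchers on one host
	// (for example, a busy kind node) can exhaust them.
	for _, name := range []string{"max_user_instances", "max_user_watches"} {
		limit, err := readLimit("/proc/sys/fs/inotify/" + name)
		if err != nil {
			fmt.Printf("%s: cannot read (%v)\n", name, err)
			continue
		}
		fmt.Printf("%s = %d\n", name, limit)
	}
}
```

If `max_user_instances` turns out to be low on the machine running the local cluster, that would be consistent with the error appearing locally but not in every CI run.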

@pavolloffay
Member Author

https://github.com/open-telemetry/opentelemetry-operator/runs/6877201778?check_suite_focus=true

=== CONT  kuttl/harness/smoke-targetallocator
    logger.go:42: 09:21:14 | smoke-targetallocator/2-install | test step failed 2-install
    logger.go:42: 09:21:14 | smoke-targetallocator/2-install | collecting log output for [type==pod,label: app.kubernetes.io/component=opentelemetry-targetallocator]
    logger.go:42: 09:21:14 | smoke-targetallocator/2-install | running command: [kubectl logs --prefix -l app.kubernetes.io/component=opentelemetry-targetallocator -n kuttl-test-dominant-airedale --all-containers --tail=10]
    logger.go:42: 09:21:14 | smoke-targetallocator/2-install | [pod/stateful-targetallocator-d966cf585-chrgw/ta-container] {"level":"info","ts":1655198411.0970092,"msg":"Starting the Target Allocator"}
    logger.go:42: 09:21:14 | smoke-targetallocator/2-install | [pod/stateful-targetallocator-d966cf585-chrgw/ta-container] {"level":"error","ts":1655198411.1046784,"logger":"allocator","msg":"Pod failure","component":"opentelemetry-targetallocator","error":"pods is forbidden: User \"system:serviceaccount:kuttl-test-dominant-airedale:stateful-collector\" cannot list resource \"pods\" in API group \"\" in the namespace \"kuttl-test-dominant-airedale\"","stacktrace":"main.configureFileDiscovery\n\t/app/main.go:154\nmain.newServer\n\t/app/main.go:121\nmain.main\n\t/app/main.go:55\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
