Uses "ClusterFirstWithHostNet" DNS policy when hostNetwork is set to true #690

Closed
gai6948 opened this issue Feb 6, 2022 · 3 comments · Fixed by #691
Closed

Uses "ClusterFirstWithHostNet" DNS policy when hostNetwork is set to true #690

gai6948 opened this issue Feb 6, 2022 · 3 comments · Fixed by #691
Labels
area:collector Issues for deploying collector

Comments

gai6948 (Contributor) commented Feb 6, 2022

With the current operator, deploying otel-collector as an agent (daemonset mode) requires hostNetwork so that pods running on the same host can reach the agent. But communication with the otel-collector deployed as a gateway inside the cluster then fails with a DNS error. Upon investigation it turns out that pods using hostNetwork should also set dnsPolicy to ClusterFirstWithHostNet: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy

Can we make the operator aware of this logic?
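
For context, this is the pattern from the Kubernetes docs linked above, as a minimal sketch (the pod name and image are placeholders, not from this issue): a host-network pod only resolves in-cluster Service names if its DNS policy is set explicitly.

apiVersion: v1
kind: Pod
metadata:
  name: hostnet-example  # hypothetical name, for illustration only
spec:
  hostNetwork: true
  # Without this, a host-network pod falls back to the node's resolver
  # (ClusterFirst is effectively ignored), so names like
  # *.svc.cluster.local fail to resolve.
  dnsPolicy: ClusterFirstWithHostNet
  containers:
    - name: app
      image: busybox  # placeholder image
      command: ["sleep", "3600"]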

My current OpenTelemetryCollector manifest:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent
  namespace: opentelemetry-agent
spec:
  mode: daemonset
  hostNetwork: true
  image: otel/opentelemetry-collector-contrib:0.43.0
  resources:
    limits:
      cpu: 200m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 64Mi
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    processors:
      # The k8sattributes processor in the agent runs in passthrough
      # mode, so it only tags data with the minimal info the gateway's
      # k8sattributes processor needs to fill in the rest
      k8sattributes:
        passthrough: true
      memory_limiter:
        check_interval: 1s
        limit_percentage: 50
        spike_limit_percentage: 30
    extensions:
      memory_ballast:
        size_in_percentage: 20    
    exporters:
      otlp:
        endpoint: "otel-collector-collector.opentelemetry-collector.svc.cluster.local:4317"
        tls:
          insecure: true      
      logging:

    service:
      extensions: [memory_ballast]  # enable the ballast extension defined above
      pipelines:
        traces:
          receivers: [otlp]
          processors: [k8sattributes, memory_limiter]
          exporters: [otlp, logging]

DaemonSet pod spec that the operator actually created (note the dnsPolicy):

      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: otel-agent-collector
      serviceAccount: otel-agent-collector
      hostNetwork: true
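
What the operator would need to render instead (a sketch of the same fragment; only the dnsPolicy line changes):

      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirstWithHostNet  # instead of ClusterFirst
      serviceAccountName: otel-agent-collector
      serviceAccount: otel-agent-collector
      hostNetwork: true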
pavolloffay (Member):

@gai6948 can you explain why you are using a daemonset with hostNetwork?

  mode: daemonset
  hostNetwork: true

Is it because you want to expose the collector outside of the cluster, or to configure instrumentation to report data to localhost?

slenky commented Feb 11, 2022

@pavolloffay with that, we can reference the agent's address through the node IP:

            # Downward API: expose the node's IP to the container
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            # point the app's Zipkin exporter at the agent on the same node
            - name: SPRING_ZIPKIN_BASEURL
              value: http://$(HOST_IP):9411
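
This assumes the agent also exposes a Zipkin receiver, which the manifest above does not show (it only configures OTLP); 9411 is the conventional Zipkin port. A sketch of the receiver block that would have to be present in the agent's config:

receivers:
  zipkin:
    # the conventional Zipkin endpoint; with hostNetwork: true this
    # listens directly on the node's interfaces
    endpoint: 0.0.0.0:9411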

pavolloffay (Member):

@gai6948 CI is failing on the unit tests.
