Resource attribute "service.instance.id" is converted to label "instance", conflicting with auto-generated prometheus label #32484

Closed
tqi-raurora opened this issue Apr 17, 2024 · 7 comments

Component(s)

exporter/prometheus

What happened?

Description

I'm not sure if this is intended or not, but when a metric is sent with the resource attribute "service.instance.id", the prometheus exporter creates a label named "instance" for the metric.

This label will conflict with the auto-generated instance label:

instance: The <host>:<port> part of the target's URL that was scraped.
https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series

This becomes a problem when we run multiple otel collectors for high availability, because we can end up in a scenario where metrics from 2 different otel collector instances are "merged" into a single timeseries.
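The merge described above can be sketched in a few lines of Python. This is only an illustration of how a series is identified by its metric name and label set, assuming the exported labels reach storage unchanged. The collector names and the second sample value are hypothetical.

```python
from collections import defaultdict


def series_id(metric, labels):
    """A time series is identified by its metric name plus its full label set."""
    return (metric, frozenset(labels.items()))


# Hypothetical collector names; both export a sample for the same service.
exports = [
    ("collector-a", "my_gauge", {"instance": "123456", "job": "my.service"}, 10.0),
    ("collector-b", "my_gauge", {"instance": "123456", "job": "my.service"}, 12.0),
]

storage = defaultdict(list)
for collector, metric, labels, value in exports:
    storage[series_id(metric, labels)].append((collector, value))

# Both collectors land in the same series because no label tells them apart.
print(len(storage))
```

Since neither label set records which collector produced the sample, the two streams cannot be separated afterwards.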

Steps to Reproduce

1. Create a simple metric with the service.instance.id resource attribute:

metrics.json:

{
  "resourceMetrics": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": {
              "stringValue": "my.service"
            }
          },{
            "key": "service.instance.id",
            "value": {
              "stringValue": "123456"
            }
          }
        ]
      },
      "scopeMetrics": [
        {
          "scope": {
            "name": "my.library",
            "version": "1.0.0",
            "attributes": [
              {
                "key": "my.scope.attribute",
                "value": {
                  "stringValue": "some scope attribute"
                }
              }
            ]
          },
          "metrics": [
            {
              "name": "my.gauge",
              "unit": "1",
              "description": "I am a Gauge",
              "gauge": {
                "dataPoints": [
                  {
                    "asDouble": 10,
                    "timeUnixNano": "1544712660300000000",
                    "attributes": [
                      {
                        "key": "my.gauge.attr",
                        "value": {
                          "stringValue": "some value"
                        }
                      }
                    ]
                  }
                ]
              }
            }
          ]
        }
      ]
    }
  ]
}

2. Send the metric to the otel collector:

curl -X POST -H "Content-Type: application/json" -d @metrics.json -i localhost:4318/v1/metrics

Expected Result

I propose that the label be named "service_instance_id" instead of "instance".

Actual Result

A label named "instance" is created:

curl localhost:9130/metrics

# HELP my_gauge I am a Gauge
# TYPE my_gauge gauge
my_gauge{instance="123456",job="my.service",my_gauge_attr="some value"} 10
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{instance="123456",job="my.service"} 1
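The labels in this output are consistent with a mapping in which service.name becomes job, service.instance.id becomes instance, and dots in attribute names become underscores. The following is a minimal Python sketch of that mapping, not the exporter's actual code. The function names are hypothetical, the sanitization is simplified, and the "namespace/name" form for job assumes the OpenTelemetry Prometheus compatibility rules for service.namespace.

```python
import re


def sanitize(name: str) -> str:
    """Simplified: replace characters outside [a-zA-Z0-9_] with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)


def to_prometheus_labels(resource: dict, point_attrs: dict) -> dict:
    """Hypothetical helper mirroring the resource-to-label mapping seen above."""
    labels = {sanitize(key): value for key, value in point_attrs.items()}

    name = resource.get("service.name")
    namespace = resource.get("service.namespace")
    if name is not None:
        # Assumption: job is "<namespace>/<name>" when a namespace is present.
        labels["job"] = f"{namespace}/{name}" if namespace else name

    instance_id = resource.get("service.instance.id")
    if instance_id is not None:
        labels["instance"] = instance_id
    return labels


resource = {"service.name": "my.service", "service.instance.id": "123456"}
labels = to_prometheus_labels(resource, {"my.gauge.attr": "some value"})
print(labels)
```

With the resource from metrics.json, the sketch yields the same three labels that appear on my_gauge in the output above.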

Collector version

otelcol-contrib version 0.97.0

Environment information

No response

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  prometheus:
    endpoint: "0.0.0.0:9130"
    add_metric_suffixes: false
    resource_to_telemetry_conversion:
      enabled: false

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: []
      exporters: [prometheus]

Log output

No response

Additional context

No response

@tqi-raurora tqi-raurora added bug Something isn't working needs triage New item requiring triage labels Apr 17, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.


tqi-raurora commented

@Aneurysm9 Hi! Could you give your opinion on this? Thanks in advance


@github-actions github-actions bot added the Stale label Sep 4, 2024

dashpole commented Sep 4, 2024

This is intentional and comes from here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#resource-attributes

In prometheus, it is expected that job and instance are identifying for a series. We rely on that to ensure joins with the target_info metric work properly.

If you are scraping the prometheus exporter, I would encourage you to set honor_labels: true in the Prometheus scrape config. That preserves the job and instance labels derived from service.name and service.instance.id and prevents collisions of this sort.
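The effect of honor_labels can be sketched as follows. This is a simplified Python model of how Prometheus documents label conflict handling at scrape time, not its implementation. The function name and the "otel-collector" job name are hypothetical, and the target address reuses the exporter endpoint port from the issue.

```python
def resolve_labels(scraped: dict, target: dict, honor_labels: bool) -> dict:
    """Combine labels from the scraped sample with server-side target labels."""
    result = dict(scraped)
    for name, value in target.items():
        if name in result:
            if honor_labels:
                # Keep the scraped value and ignore the server-side label.
                continue
            # Default behavior: move the scraped value to exported_<name>.
            result["exported_" + name] = result[name]
        result[name] = value
    return result


scraped = {"instance": "123456", "job": "my.service", "my_gauge_attr": "some value"}
# Hypothetical scrape target: the collector's prometheus exporter endpoint.
target = {"instance": "localhost:9130", "job": "otel-collector"}

default = resolve_labels(scraped, target, honor_labels=False)
honored = resolve_labels(scraped, target, honor_labels=True)
print(default)
print(honored)
```

Without honor_labels, the original values survive only as exported_instance and exported_job. With it, job and instance keep the values derived from service.name and service.instance.id.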

@dashpole dashpole removed Stale needs triage New item requiring triage labels Sep 4, 2024
@dashpole dashpole self-assigned this Sep 4, 2024

@github-actions github-actions bot added the Stale label Nov 4, 2024
github-actions bot commented Jan 3, 2025

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 3, 2025