Gathered hostmetrics process shown in console but not as metric in prometheus #36496
Comments
Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself.
/help-wanted receiver/hostmetrics
@securom1987 do you see any errors logged from the …? Can you:
@securom1987, the format of metric names in OTLP disagrees with the format of metric names in Prometheus. I have no experience with the …. It's clunkier, but this workaround would probably also work if you don't mind having to explicitly map each metric tag.
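The workaround linked in that comment isn't reproduced here; a minimal sketch of explicitly mapping per-process resource attributes onto datapoint labels with the transform processor (the attribute names are chosen for illustration, not taken from the original suggestion) could look like this:

processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          # copy per-process resource attributes onto each datapoint so the
          # prometheus exporter emits them as labels
          - set(attributes["process.pid"], resource.attributes["process.pid"])
          - set(attributes["process.executable.name"], resource.attributes["process.executable.name"])
          - set(attributes["process.owner"], resource.attributes["process.owner"])

The processor would then be added to the metrics pipeline before the exporter, one statement per attribute you want to keep.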
Hi, thank you for your reply. Here is the output written to syslog: … In my opinion, only the processes from /proc cannot be scraped.
Pinging code owners for exporter/prometheus: @Aneurysm9 @dashpole @ArthurSens. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.
Can you send an example of the Prometheus scrape? Is it empty?
Do you mean its config file?

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "otel-collector-Gateway"
    scrape_interval: 5s
    honor_labels: true
    static_configs:
      - targets: ["localhost:8889"]
I did not mean the scrape config, though that is good to know anyway. What I meant was: if you curl the Prometheus endpoint you started with the collector, what is the output?
After curling the Prometheus remote write endpoint with "curl -X POST http://nuc-cloud:9090/api/v1/write" I get "snappy: corrupt input" as an answer. So it should work, but I think I have to translate the OTLP metrics into Prometheus-friendly metrics, as tdg5 already mentioned. But I have no clue how to do that.
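For context, "snappy: corrupt input" is what the remote write endpoint returns for a plain curl POST, because it expects a snappy-compressed protobuf body; the collector's prometheusremotewrite exporter produces that format itself. A minimal sketch, assuming a recent Prometheus started with --web.enable-remote-write-receiver:

exporters:
  prometheusremotewrite:
    # Prometheus remote write endpoint (requires the remote write receiver
    # to be enabled on the Prometheus side)
    endpoint: "http://nuc-cloud:9090/api/v1/write"
    # turn resource attributes such as process.pid and process.executable.name
    # into labels on the written series
    resource_to_telemetry_conversion:
      enabled: true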
I am referring to the …
The …
The OTel Collector and Prometheus are running on the same host.
curl http://nuc-cloud:8889/metrics
# HELP process_cpu_time_seconds_total Total CPU seconds broken down by different states.
# TYPE process_cpu_time_seconds_total counter
process_cpu_time_seconds_total{job="NUC-CLOUD",state="system"} 0
process_cpu_time_seconds_total{job="NUC-CLOUD",state="user"} 0
process_cpu_time_seconds_total{job="NUC-CLOUD",state="wait"} 0
# HELP process_disk_io_bytes_total Disk bytes transferred.
# TYPE process_disk_io_bytes_total counter
process_disk_io_bytes_total{direction="read",job="NUC-CLOUD"} 1.32905e+06
process_disk_io_bytes_total{direction="write",job="NUC-CLOUD"} 3807
# HELP process_memory_usage_bytes The amount of physical memory in use.
# TYPE process_memory_usage_bytes gauge
process_memory_usage_bytes{job="NUC-CLOUD"} 1.077248e+06
# HELP process_memory_virtual_bytes Virtual memory size.
# TYPE process_memory_virtual_bytes gauge
process_memory_virtual_bytes{job="NUC-CLOUD"} 5.873664e+06
# HELP system_processes_count Total number of processes in each state.
# TYPE system_processes_count gauge
system_processes_count{job="NUC-CLOUD",status="blocked"} 0
system_processes_count{job="NUC-CLOUD",status="idle"} 76
system_processes_count{job="NUC-CLOUD",status="running"} 1
system_processes_count{job="NUC-CLOUD",status="sleeping"} 137
# HELP system_processes_created_total Total number of created processes.
# TYPE system_processes_created_total counter
system_processes_created_total{job="NUC-CLOUD"} 1.471905e+06
I understand the problem now. The hostmetrics process scraper puts the identifying attributes (process.pid, process.executable.name, process.command, etc.) on the resource, not on the datapoints. The prometheus exporter does not convert resource attributes into labels by default, so the per-process series all collapse into the bare process_* series you see in the scrape. Try changing your prometheus exporter config to enable resource_to_telemetry_conversion.
For reference, I used this config locally to verify:
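The config referenced there wasn't captured in this thread; a minimal sketch of a collector config that exercises the fix (the values are illustrative, not the actual config from that comment) might be:

receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      # per-process CPU, memory, and disk I/O metrics
      process:

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
    # expose resource attributes (process.pid, process.executable.name, ...)
    # as labels on the exported metrics
    resource_to_telemetry_conversion:
      enabled: true

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [prometheus]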
@braydonk's suggestion should do the trick! Another approach, if you prefer, is to send OTLP directly to Prometheus: https://prometheus.io/docs/guides/opentelemetry/ If you go in that direction, you'll want to take a look at …
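A rough sketch of that approach on the collector side, assuming a Prometheus version with the OTLP receiver enabled (for example, started with --enable-feature=otlp-write-receiver):

exporters:
  otlphttp:
    # Prometheus serves its OTLP receiver under /api/v1/otlp; the otlphttp
    # exporter appends /v1/metrics to this base path
    endpoint: "http://nuc-cloud:9090/api/v1/otlp"

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [resource, batch]
      exporters: [otlphttp]

Recent Prometheus versions also let you promote selected resource attributes (such as process.executable.name) to labels via the otlp section of prometheus.yml, which is likely what the truncated reference above points at.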
I had also tried this configuration with the prometheusremotewrite exporter before and it worked very well, so the addition of resource_to_telemetry_conversion (enabled: true) did the trick! The following configuration works in nearly the same way:

exporters:
  debug:
    verbosity: detailed
  prometheus:
    resource_to_telemetry_conversion:
      enabled: true
    endpoint: 0.0.0.0:8889
  #prometheusremotewrite:
  #  endpoint: "http://nuc-cloud:9090/api/v1/write"
  #  resource_to_telemetry_conversion:
  #    enabled: true

Thank you for your help!
Works as described in the last comment.
Component(s)
receiver/hostmetrics
What happened?
Description
As mentioned in the description, I am using the OTel Collector v0.114 and the hostmetrics receiver with the process scraper on Ubuntu Linux.
I want to scrape process information. The processes are shown in the debug output / console, for example:
Console output
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.owner: Str(root)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: InstrumentationScope github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/processscraper 0.114.0
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> Name: process.cpu.time
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> Name: process.memory.usage
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> Name: process.memory.virtual
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.pid: Int(616072)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.parent_pid: Int(1)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.executable.name: Str(loki)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.executable.path: Str()
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.command: Str(/usr/bin/loki)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.command_line: Str(/usr/bin/loki -config.file /etc/loki/config.yml)
Nov 22 09:37:35 nuc-cloud otelcol-contrib[1156080]: -> process.owner: Str(loki)
Based on this otel-collector config:
extensions:
  health_check:
    endpoint: 0.0.0.0:1133

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  hostmetrics:
    collection_interval: 10s
    scrapers:
      # CPU utilization metrics
      #cpu:
      # Disk I/O metrics
      #disk:
      # File System utilization metrics
      #filesystem:
      # CPU load metrics
      #load:
      # Memory utilization metrics
      #memory:
      # Network interface I/O metrics & TCP connection metrics
      #network:
      # Paging/Swap space utilization and I/O metrics
      #paging:
      # Per-process CPU, memory, and disk I/O metrics
      process:
      # Process count metrics
      processes:

processors:
  batch:
  resource:
    attributes:
      - action: insert
        key: service.name ## sets job=HOST1 on the metric in Grafana
        value: NUC-CLOUD

exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  extensions: [health_check]
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [resource, batch]
      exporters: [debug, prometheus]
The problem:
The metrics that are written to the console are not shown in Prometheus.
Collector version
v0.114.0
Environment information
Environment
OS: (e.g., "Ubuntu 24.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
Log output
Additional context
Metrics that are shown in the console/debug log are not shown in Prometheus.
For example, the process "loki" in the log output above.