### Component(s)
receiver/windowsperfcounters
### What happened?

#### Description
Whenever a multi-instance counter is scraped and there are multiple instances with the same name (e.g. `Process\ID Process` for `notepad.exe`), the receiver scrapes all instances but puts the exact same value in the `instance` label. This is incompatible with most backends: the metrics are treated as the same time series and are either aggregated, or only the last data point is kept. The behavior also does not match what PerfMon shows, which would be `notepad` and `notepad#1` in the example above.
#### Steps to Reproduce

- Start `notepad.exe` as your normal user
- Start `notepad.exe` as an administrator (to ensure you have two different `notepad.exe` PIDs on Windows 11)
- Use the provided configuration file (modify as needed)
  - Mimir is optional; I was testing another issue with `prometheusremotewrite`
#### Expected Result

Windows Performance Monitor handles this by concatenating the instance name with its index when there are multiple occurrences of the same instance (usually when multiple instances of a process are running):

- Metrics for instances `notepad` and `notepad#1`, as shown in Windows Performance Monitor: two time series, each with its own PID value
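The expected disambiguation can be sketched as follows. This is an illustrative standalone function, not the receiver's actual code: the first occurrence of a name is kept as-is, and each subsequent duplicate gets a `#<n>` suffix, mirroring PerfMon's display.

```go
package main

import "fmt"

// disambiguate mimics Windows Performance Monitor's handling of duplicate
// instance names: the first occurrence keeps its bare name, and later
// occurrences are suffixed with "#<n>" (notepad, notepad#1, notepad#2, ...).
func disambiguate(instances []string) []string {
	seen := make(map[string]int)
	out := make([]string, 0, len(instances))
	for _, name := range instances {
		n := seen[name]
		seen[name] = n + 1
		if n == 0 {
			out = append(out, name)
		} else {
			out = append(out, fmt.Sprintf("%s#%d", name, n))
		}
	}
	return out
}

func main() {
	fmt.Println(disambiguate([]string{"notepad", "notepad", "svchost"}))
	// prints [notepad notepad#1 svchost]
}
```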
#### Actual Result

- Two data points for instance `notepad` combined in the same time series
### Collector version

0.97
### Environment information

#### Environment

- OS: Windows 11
- Build: go 1.22 on Ubuntu 22.04 (`GOOS=windows`)
### OpenTelemetry Collector configuration

```yaml
receivers:
  windowsperfcounters:
    metrics:
      process.pid:
        gauge:
    collection_interval: 5s
    perfcounters:
      - object: Process
        instances: "note*"
        counters:
          - name: "ID Process"
            metric: process.pid

processors:
  batch:
  memory_limiter:
    check_interval: 1s
    limit_mib: 500
    spike_limit_mib: 100

extensions:

exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    tls:
      insecure: true
  debug:
    verbosity: detailed

service:
  extensions: []
  pipelines:
    metrics:
      receivers: [windowsperfcounters]
      processors: []
      exporters: [debug, prometheusremotewrite]
```
### Log output

```
2024-04-11T07:05:15.889-0400  info  MetricsExporter  {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 1, "data points": 2} // <------ Two data points
2024-04-11T07:05:15.889-0400  info  ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: process.pid
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> instance: Str(Notepad) // <---------
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-04-11 11:03:44.8430194 +0000 UTC
Value: 16660.000000
NumberDataPoints #1
Data point attributes:
     -> instance: Str(Notepad) // <-------
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-04-11 11:03:44.8430194 +0000 UTC
Value: 21988.000000
{"kind": "exporter", "data_type": "metrics", "name": "debug"}
```
### Additional context

I already have a PR that I can submit for this. I understand that this might increase label cardinality, so I am open to gating the behavior behind a configuration option for the receiver.
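One possible shape for such a gate is sketched below. The option name `append_instance_index` is hypothetical and does not exist in the receiver today; it is only meant to illustrate an opt-in flag that preserves the current default behavior.

```yaml
receivers:
  windowsperfcounters:
    # Hypothetical opt-in flag: when true, duplicate instance names are
    # disambiguated PerfMon-style (notepad, notepad#1, ...); when false
    # (the default), the current behavior is unchanged.
    append_instance_index: true
    collection_interval: 5s
    perfcounters:
      - object: Process
        instances: "note*"
        counters:
          - name: "ID Process"
            metric: process.pid
```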