Prometheus receiver intermittently dropping kubelet metrics #40696

Open
@alclark704

Component(s)

exporter/prometheusremotewrite

What happened?

Description

We're using the Prometheus receiver as a drop-in replacement for Prometheus, and it's mostly working as expected, but we're currently seeing gaps in the kubelet metrics collected by the collectors. We run a pool of collectors plus a Target Allocator deployment, all managed as an OpenTelemetryCollector resource by the OpenTelemetry Operator, with the collectors remote-writing to an upstream Thanos Receive stack. We provide the collectors with the same static config that we provide to Prometheus (which we still have running alongside). What we see is intermittent gaps in the metrics coming from kubelet: only one or two dropped scrapes at a time, but they leave gaps in the series that skew all of our node-based metrics.
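
For reference, the deployment is shaped roughly like this (a trimmed sketch only; the resource name, mode and replica count here are assumptions/placeholders, and the real collector config is in the configuration section further down):

# Trimmed sketch of the OpenTelemetryCollector resource; name, mode and
# replica count are placeholders, not the exact values we run.
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: opentelemetry-metrics   # placeholder name
spec:
  mode: statefulset             # a pool of collector pods (assumption)
  replicas: 3                   # placeholder count
  targetAllocator:
    enabled: true               # the Target Allocator deployment managed alongside
  config: {}                    # the pipeline config shown further down goes here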

There's been some discussion on the CNCF slack so far: https://cloud-native.slack.com/archives/C01LSCJBXDZ/p1748621317097949

But so far we have investigated:

  • The collectors aren't under any resource strain when series are dropped
  • The drops only happen in the kubernetes-nodes and kubernetes-nodes-cadvisor jobs
  • The drops are not consistent across the nodes a particular collector scrapes (every node scraped by a given collector is affected at some point, but not at the same time)
  • The drops don't correlate with collectors restarting or scaling, or with nodes starting/stopping
  • There's nothing obvious in the collector logs, even with detailed logging (some timeouts to the apiserver, but those seem to relate to nodes starting/stopping rather than to these gaps)
  • Prometheus, still running in the same cluster with the same config, does not have the same problem (the queries sketched after this list are roughly how we make the difference visible)
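
For anyone trying to compare, something like the following should surface the gaps (assuming the 15s scrape interval from the config below, so every 5m window should contain 20 samples of up per instance):

# Samples of `up` ingested per instance over a 5m window; with a 15s scrape
# interval this should always be 20, and lower values mean missed scrapes.
count_over_time(up{job="kubernetes-nodes"}[5m])

# Instances that had at least one gap at some point in the past hour.
min_over_time(count_over_time(up{job="kubernetes-nodes"}[5m])[1h:5m]) < 20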

I'm not entirely sure the issue is with the Prometheus receiver: given that it uses the same scrape logic as Prometheus itself, I see no reason why it would behave differently. It could well be a difference in how the collector/exporter handle the series, particularly the kubelet ones scraped with honor_timestamps: true, but I'm struggling to narrow down what exactly is happening.
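
One experiment still on my list (a sketch only, I haven't confirmed it changes anything yet): flip honor_timestamps off for just the two kubelet jobs so the receiver stamps samples with its own scrape time, and see whether the gaps disappear. If they do, that points at how the pipeline handles the kubelet-supplied timestamps rather than at the scraping itself.

# Diagnostic-only change to the two kubelet jobs, everything else unchanged:
# let the receiver assign scrape timestamps instead of honoring the ones
# exposed by the kubelet/cadvisor endpoints.
scrape_configs:
- job_name: kubernetes-nodes
  honor_timestamps: false   # was: true
  # ... rest of the job exactly as in the config below
- job_name: kubernetes-nodes-cadvisor
  honor_timestamps: false   # was: true
  # ... rest of the job exactly as in the config below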

Steps to Reproduce

OTel Collector with the static config below, using the Prometheus remote write exporter to write to a Thanos Receive stack.

Expected Result

This is from Prometheus in the same cluster monitoring up{job="kubernetes-nodes"} for a given instance:

[screenshot: up{job="kubernetes-nodes"} for one instance as reported by Prometheus, with no gaps]

Actual Result

This is the same instance via the collector:

[screenshot: the same instance's up series via the collector, showing intermittent gaps]

Collector version

0.122.1

Environment information

Environment

EKS: v1.31.9-eks-5d4a308

OpenTelemetry Collector configuration

exporters:
  debug: {}
  prometheusremotewrite:
    endpoint: ${THANOS_RECEIVER}
    remote_write_queue:
      enabled: true
      num_consumers: 5
    target_info:
      enabled: false
    timeout: 10s
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 15
receivers:
  prometheus:
    api_server:
      enabled: true
      server_config:
        endpoint: localhost:9090
    config:
      global:
        scrape_interval: 15s
      scrape_configs:
      - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        honor_timestamps: true
        job_name: kubernetes-nodes
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubernetes.default.svc:443
          target_label: __address__
        - regex: (.+)
          replacement: /api/v1/nodes/$$1/proxy/metrics
          source_labels:
          - __meta_kubernetes_node_name
          target_label: __metrics_path__
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
      - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        honor_timestamps: true
        job_name: kubernetes-nodes-cadvisor
        kubernetes_sd_configs:
        - role: node
        metric_relabel_configs:
        - action: drop
          regex: container_cpu_(load_average_10s|system_seconds_total|user_seconds_total)
          source_labels:
          - __name__
        - action: drop
          regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
          source_labels:
          - __name__
        - action: drop
          regex: container_memory_(mapped_file|swap)
          source_labels:
          - __name__
        - action: drop
          regex: container_(tasks_state|threads_max)
          source_labels:
          - __name__
        - action: drop
          regex: container_spec_(cpu.*|memory_swap_limit_bytes|memory_reservation_limit_bytes)
          source_labels:
          - __name__
        - action: drop
          regex: .+;
          source_labels:
          - id
          - pod
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubernetes.default.svc:443
          target_label: __address__
        - regex: (.+)
          replacement: /api/v1/nodes/$$1/proxy/metrics/cadvisor
          source_labels:
          - __meta_kubernetes_node_name
          target_label: __metrics_path__
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
      - job_name: kubernetes-service-endpoints
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
...
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
...
    target_allocator:
      collector_id: ${POD_NAME}
      endpoint: http://opentelemetry-metrics-targetallocator
      interval: 30s
service:
  pipelines:
    metrics:
      exporters:
      - prometheusremotewrite
      - debug
      processors:
      - memory_limiter
      receivers:
      - prometheus
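
To help rule out the export path (as opposed to the receiver dropping scrapes), the next check is to scrape the collectors' own telemetry, which I believe is exposed on :8888/metrics by default, and watch the exporter counters around the time of a gap. A rough sketch of the extra scrape job (the pod label selector is a placeholder):

- job_name: otel-collector-self
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    regex: opentelemetry-metrics-collector   # placeholder pod label value
    source_labels:
    - __meta_kubernetes_pod_label_app_kubernetes_io_name
  - regex: (.+)
    replacement: $$1:8888
    source_labels:
    - __meta_kubernetes_pod_ip
    target_label: __address__
# counters worth watching around a gap (names may differ slightly by version):
#   otelcol_receiver_refused_metric_points
#   otelcol_exporter_send_failed_metric_points
#   otelcol_exporter_enqueue_failed_metric_points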

Log output

Additional context

No response
