[receiver/vcenter] Network Packet Metrics Have Metadata Issues #32835

Closed
@StefanKurek

Description

Component(s)

receiver/vcenter

What happened?

Description

There are several issues with how the packet metrics and their metadata are presented. Both vcenter.*.network.packet.count metrics are incorrectly marked with rate units in the metadata, and both are marked as non-monotonic cumulative sums. The datapoints actually returned are delta sums of packets transmitted over successive 20s intervals.

A similar issue exists for vcenter.host.network.packet.errors, but only for the discrepancy between the non-monotonic cumulative sum metadata and the delta-sum datapoints.
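To make the mismatch concrete, here is a minimal sketch (with made-up values) of how the same five datapoints read under the two interpretations:

```python
# Five successive 20s delta sums of transmitted packets, as the
# receiver actually returns them (values are illustrative only).
deltas = [120, 95, 110, 130, 105]

# Correct delta-sum reading: total packets over the 100s spanned.
total_packets = sum(deltas)

# Misreading the same points as a cumulative sum (what the current
# metadata claims) treats the last value as the running total.
misread_total = deltas[-1]
```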

Steps to Reproduce

Collect against any vCenter environment with VMs.

Expected Result

vcenter.vm.network.packet.count is returned with datapoints each representing a packet transmission rate (for each VM).
vcenter.host.network.packet.count is returned with datapoints each representing a packet transmission rate (for each Host).
vcenter.host.network.packet.errors is returned with datapoints each representing a packet error rate (for each Host).

Actual Result

vcenter.vm.network.packet.count is returned with a single datapoint representing 20s of accumulated packet count data (for each VM).
vcenter.host.network.packet.count is returned with 5 datapoints each representing the previous 20s of accumulated packet count data (for each Host).
vcenter.host.network.packet.errors is returned with 5 datapoints each representing the previous 20s of accumulated packet error count data (for each Host).

Collector version

v1.6.0/v0.99.0

Environment information

No response

OpenTelemetry Collector configuration

extensions:
  basicauth/prom:
    client_auth:
      username: [PROMUSER]
      password: [PROMPASS]

exporters:
  prometheusremotewrite:
    endpoint: [PROMENDPOINT]
    auth:
      authenticator: basicauth/prom
    resource_to_telemetry_conversion:
      enabled: true # Convert resource attributes to metric labels

processors:
  batch:
    # https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor

receivers:
  vcenter:
    endpoint: https://[VCENTERHOST]
    username: [VCENTERUSER]
    password: [VCENTERPASS]
    tls:
      insecure: true
    collection_interval: 1m
    initial_delay: 1s

service:
  extensions: [basicauth/prom]
  pipelines:
    metrics:
      receivers: [vcenter]
      processors: [batch]
      exporters: [prometheusremotewrite]

Log output

No response

Additional context

@djaglowski @schmikei I don't actually want to change the metadata to monotonic delta sums. That would cause these metrics to stop working with the Prometheus exporters, which would be problematic for my current use case.

Instead, I think it might make more sense to convert them to rates by dividing the returned values by the sample interval. In that case we could make (or keep) the units as rates. We could then either convert the metrics to Gauges or keep them as they are (I don't think it hurts anything).
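A minimal sketch of that rate conversion, assuming the fixed 20s vCenter real-time sample interval (delta values are illustrative):

```python
SAMPLE_INTERVAL_S = 20  # vCenter real-time performance sample interval

# Delta sums of packets per 20s window, as currently returned
# (illustrative values).
deltas = [120, 95, 110, 130, 105]

# Divide each delta by the interval to get packets-per-second rates,
# which could then be emitted as Gauge datapoints.
rates = [d / SAMPLE_INTERVAL_S for d in deltas]
```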

If we convert to rates, we also have the option of using a larger interval (5m) to get the deltas for these.

Whether or not we convert to rates, we also have the option to backfill these datapoints so they "fill up" the collection interval, making the data more honest to the end user.
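A sketch of what that backfilling could look like, assuming the datapoints arrive oldest-first and would otherwise all carry the scrape timestamp (timestamps and values are made up):

```python
from datetime import datetime, timedelta, timezone

SAMPLE_INTERVAL_S = 20
deltas = [120, 95, 110, 130, 105]  # illustrative 20s delta sums

# Timestamp of the scrape (illustrative).
collection_time = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)

# Backfill: step each earlier datapoint back by one interval so the
# series actually spans the window the deltas describe.
points = [
    (collection_time
     - timedelta(seconds=SAMPLE_INTERVAL_S * (len(deltas) - 1 - i)), d)
    for i, d in enumerate(deltas)
]
```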

In my use case (importing into Grafana via a Prometheus-based exporter, which ends up taking only the latest datapoint), I could always convert my sample to a 20s rate on the Grafana side.
