Description
Component(s)
exporter/prometheusremotewrite
Describe the issue you're reporting
We had an ongoing issue with the Prometheus Remote Write Exporter (PRWE) where all metrics would fail to send for 15 minutes at a time, after which the exporter would recover on its own. The internal telemetry showed otelcol_exporter_sent_metric_points_total drop to 0. The queue size (otelcol_exporter_queue_size) remained at 0, and the HTTP endpoint that the PRWE was configured to send to did not receive any traffic. We enabled debug logging, but the PRWE does not seem to write any debug logs. The error message we saw is:
error internal/queue_sender.go:46 Exporting failed. Dropping data. {"otelcol.component.id": "prometheusremotewrite/observe", "otelcol.component.kind": "Exporter", "otelcol.signal": "metrics", "error": "Permanent error: Permanent error: context deadline exceeded; Permanent error: Permanent error: context deadline exceeded", "errorCauses": [{"error": "Permanent error: Permanent error: context deadline exceeded"}, {"error": "Permanent error: Permanent error: context deadline exceeded"}], "dropped_items": 638}
/home/runner/work/observe-agent/observe-agent/vendor/go.opentelemetry.io/collector/exporter/exporterqueue/async_queue.go:47
go.opentelemetry.io/collector/exporter/exporterqueue.(*asyncQueue[...]).Start.func1
/home/runner/work/observe-agent/observe-agent/vendor/go.opentelemetry.io/collector/exporter/exporterhelper/internal/batcher/disabled_batcher.go:23
go.opentelemetry.io/collector/exporter/exporterhelper/internal/batcher.(*disabledBatcher[...]).Consume
/home/runner/work/observe-agent/observe-agent/vendor/go.opentelemetry.io/collector/exporter/exporterhelper/internal/queue_sender.go:46
go.opentelemetry.io/collector/exporter/exporterhelper/internal.NewQueueSender.func1
This issue was present in v0.118.0. After we upgraded to v0.124.0, it appears to be fixed. I would like to better understand the root cause, since it was never listed as a bug fix under the PRWE (my guess is that it was related to an exporter helper fix). Can anyone help point me to what the fix might have been? Thank you!