Skip to content

[receiver/googlecloudmonitoring: Support distribution metric #40216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Joffref
Copy link

@Joffref Joffref commented May 22, 2025

Description

Support for the DISTRIBUTION metric kind. Learn more: https://cloud.google.com/monitoring/api/v3/kinds-and-types#metric-kinds

Link to tracking issue

Fixes #39600

Testing

I've tested this patch with the following metrics:

  • run.googleapis.com/container/cpu/utilizations
  • run.googleapis.com/container/memory/utilizations

It likely needs broader testing—I'm happy to try it out with more relevant Google metrics.

@Joffref Joffref changed the title [receiver/googlecloudmonitoringreceiver]: Support distribution [receiver/googlecloudmonitoring: Support distribution metric May 22, 2025
@atoulme
Copy link
Contributor

atoulme commented May 22, 2025

Please add tests and a changelog.

@Joffref
Copy link
Author

Joffref commented May 22, 2025

@atoulme I'm on my way to add the changelog and fix the linting errors. I don't see where I should add tests—could you guide me? I only see some tests related to receiver creation.

@atoulme
Copy link
Contributor

atoulme commented May 22, 2025

at minimum a set of unit tests calling ConvertDistributionToMetrics and covering the main use cases, so that your new code addition has test coverage. Codeowners can give you more concrete guidelines.


// Handle bucket options based on type
if bucketOpts := distValue.BucketOptions; bucketOpts != nil {
if expBuckets := bucketOpts.GetExponentialBuckets(); expBuckets != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: This is not doing any translation for bucket boundaries/not translating to an OTel exponential histogram. How does this work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got confused here between the GCM data model and the OTEL one — I’ll take a deeper look. Moving this PR to draft, and happy to incorporate your additions!

@bripkens
Copy link
Contributor

@Joffref happy to collaborate on this one. I have been wanting to finish up a similar PR to make exponential GCM distributions work. Can share parts of that tomorrow/Saturday most likely.

@Joffref
Copy link
Author

Joffref commented May 22, 2025

Hey @bripkens! Sure, happy to have your help—I'm still quite new to contributing to OTEL. I'll address your comment later.

@Joffref Joffref marked this pull request as draft May 23, 2025 03:02
Copy link
Contributor

@dashpole dashpole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some pointers to help

if point.Interval.EndTime != nil && point.Interval.EndTime.IsValid() {
dp.SetTimestamp(pcommon.NewTimestampFromTime(point.Interval.EndTime.AsTime()))
} else {
mb.logger.Warn("EndTime is invalid for metric:", zap.String("Metric", ts.GetMetric().GetType()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set the end time to the current time in this case?


distValue := point.Value.GetDistributionValue()
if distValue == nil {
mb.logger.Warn("Distribution value is nil for metric:", zap.String("Metric", ts.GetMetric().GetType()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this check before we AppendEmpty to the datapoints so that we don't pass through empty distribution points?

dp.SetCount(uint64(distValue.Count))

// Handle bucket options based on type
if bucketOpts := distValue.BucketOptions; bucketOpts != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Set exemplars
if exemplars := distValue.Exemplars; exemplars != nil {
for _, exemplar := range exemplars {
dp.Exemplars().AppendEmpty().SetDoubleValue(float64(exemplar.Value))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will want to make sure you get the trace ID and span ID from the SpanContext exemplar attachment. You will also want to set the exemplar's filtered attributes from the DroppedLabels attachment. See https://github.com/GoogleCloudPlatform/opentelemetry-operations-go/blob/33641345ef4756e48d7011fbb38a5c64ec6b3827/exporter/collector/metrics.go#L967

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 12, 2025
songy23 pushed a commit that referenced this pull request Jun 16, 2025
…40688)

#### Description

This adds support for converting Google Cloud monitoring delta
distribution metrics to OpenTelemetry histograms.

#### Link to tracking issue

This is related to
#39600,
although it might not be a complete solution, since this only
contributes delta distributions, but not cumulative distributions.

#### Testing

This has been tested extensively with the metric
`run.googleapis.com/request_latencies`.

Here is a comparison of the p95 of that metric directly in Google Cloud
and of the OTel metric as produced by the new code in a third party tool
(Dash0).

<img width="1742" alt="20250612_request_latencies_comparison"
src="https://github.com/user-attachments/assets/dd88d338-3abd-4dfb-a969-72af7c345a5c"
/>

There charts have minor differences which I suspect are introduced by
different interpolation strategies in the two tools (we only have one
datapoint per minute), but besides that, the charts clearly show the
same data -- peaks and valleys are at the same timestamps and the y-axis
values also match.

There are also comprehensive unit tests for the new functionality in
`receiver/googlecloudmonitoringreceiver/internal/metrics_conversion_test.go`.

#### Remarks

Unfortunately I only saw #40216 after I had more or less finished my
implementation here. I realize that this is not ideal, since there are
now two different PRs that attempt to add the same feature. I am not
sure if the author of #40216 is still working on their PR actively. I
think the last status there seems to be that tests are missing. I'm
happy to port over changes from #40216 in case it has things that are
missing here or help porting over changes from here to #40216, if that
helps. That said, this PR has a full set of unit tests and it also has
support for converting the span context exemplar (although that
certainly needs to be double checked, see inline comments).

---------

Co-authored-by: Ben Blackmore <[email protected]>
@Joffref
Copy link
Author

Joffref commented Jun 16, 2025

Duplicate of #40688, great job @basti1302!

@Joffref Joffref closed this Jun 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support ingesting gcp metrics with value type being Distribution
5 participants