
GCP detector ignores context #1026


The gcp detector in the resourcedetection processor ignores the context it is given:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/58d93b20516223707ec8de05bd47f579c6ab03fc/processor/resourcedetectionprocessor/internal/gcp/gcp.go#L55

As a result, the timeout configured for the processor is not applied to the metadata server queries in:

```go
func (d *Detector) ProjectID() (string, error) {
	// N.B. d.metadata.ProjectIDWithContext(context.TODO()) is cached globally, so if we use it here it's untestable.
	s, err := d.metadata.GetWithContext(context.TODO(), "project/project-id")
	return strings.TrimSpace(s), err
}

// instanceID returns the ID of the instance in which this program is running.
func (d *Detector) instanceID() (string, error) {
	// N.B. d.metadata.InstanceIDWithContext(context.TODO()) is cached globally, so if we use it here it's untestable.
	s, err := d.metadata.GetWithContext(context.TODO(), "instance/id")
	return strings.TrimSpace(s), err
}
```
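
A sketch of the kind of change that would bound these calls, assuming the detector holds a `cloud.google.com/go/compute/metadata` client (the `Detector` shape and the method name `projectIDWithContext` here are illustrative, not the project's actual patch):

```go
package gcp

import (
	"context"
	"strings"

	"cloud.google.com/go/compute/metadata"
)

// Detector is a stand-in for the real detector type in this sketch.
type Detector struct {
	metadata *metadata.Client
}

// projectIDWithContext (hypothetical name) threads the caller's ctx through
// instead of context.TODO(), so the processor's timeout and cancellation
// apply to the metadata server request.
func (d *Detector) projectIDWithContext(ctx context.Context) (string, error) {
	s, err := d.metadata.GetWithContext(ctx, "project/project-id")
	return strings.TrimSpace(s), err
}
```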

As an example, the following configuration:

```yaml
resourcedetection:
  detectors:
  - gcp
  timeout: 2s
```
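
For context, the processor is expected to enforce this timeout by handing the detector a deadline-bound context. The self-contained sketch below (illustrative names only, not the processor's actual code) shows why a detector that substitutes `context.TODO()` defeats that bound:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// detect stands in for a resource detector's Detect(ctx) call that takes
// about 10 seconds when the metadata server is unreachable.
func detect(ctx context.Context) error {
	select {
	case <-time.After(10 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	// Mirrors the processor's `timeout: 2s` setting.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// A detector that honors ctx returns after ~2s with
	// context.DeadlineExceeded; one that substitutes context.TODO()
	// blocks for the full 10s, as in the logs below.
	fmt.Println(detect(ctx))
}
```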

results in a 10 second init delay for the processor:

```
2025-03-06T07:45:06.697Z	info	internal/resourcedetection.go:137	began detecting resource information	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "traces", "otelcol.signal": "traces"}
2025-03-06T07:45:16.750Z	info	internal/resourcedetection.go:188	detected resource information	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "traces", "otelcol.signal": "traces", "resource": {}}
```

In the above example it took 10 seconds for the processor to initialize; during that time the collector is not in the Ready state, so readiness probes fail. The initialization time appears to be unbounded, which can exceed the readiness probe timeout and leave the collector not running, stuck in a CrashLoopBackOff state in Kubernetes.

This happens in setups that don't run on GCP.

cc @damemi
