Description
Component(s)
No response
Describe the issue you're reporting
The term "subcomponent" is not officially defined in the OpenTelemetry Collector, but two definitions could apply:
- OpenTelemetry components that can be started or stopped at runtime. This typically includes receivers instantiated by mechanisms such as receiver_creator through the ComponentFactory interface.
- Independent units of work within a component whose status can be reported individually. A good example of this is the hostmetrics receiver, where each scraper might want to report its own status. The mdatagen docs refer to each scraper as a subcomponent.
Because these components are not part of the collector's static service graph, they can be more difficult to observe externally.
The collector offers two primary mechanisms for monitoring its components: telemetry (metrics, logs, and traces) and health/status reporting:
- Internal telemetry: If the collector's instrumentation is passed to the subcomponents (components started by the receiver_creator or internal scrapers), they contribute to the internal telemetry and generate metrics (as well as logs and traces):
otelcol_scraper_scraped_metric_points{receiver="hostmetrics",scraper="cpu", ...} 1
otelcol_scraper_scraped_metric_points{receiver="hostmetrics",scraper="disk", ...} 7
- Health checks: The healthcheck extension provides a status endpoint where a user can query the status of all the collector's components: http://localhost:13133/health/status?verbose. Subcomponents do not show up there, so they have no health checks.
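For anyone who wants to reproduce the health check observation, here is a minimal sketch of querying the status endpoint from Go. The helper names `fetchStatus` and `newStubCollector` are made up for illustration, and the stub server returns a placeholder body (not the extension's real response format) so the sketch runs without a collector:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchStatus performs a GET against a status endpoint and returns the HTTP
// status code together with the raw response body.
func fetchStatus(url string) (int, string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

// newStubCollector is a hypothetical stand-in for a running collector.
// The body it serves is a placeholder, not the real healthcheck payload.
func newStubCollector() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/health/status", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"placeholder": true}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubCollector()
	defer srv.Close()

	// Against a real collector the URL would be
	// http://localhost:13133/health/status?verbose
	code, body, err := fetchStatus(srv.URL + "/health/status?verbose")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(code, body)
}
```

Pointing `fetchStatus` at the real endpoint and inspecting the body is enough to confirm which components are listed.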
Currently, there's no way to have "component events" that represent the status of subcomponents.
One solution would be to extend the Event structure with a new private subComponentID field that identifies the corresponding subcomponent:
func NewSubComponentEvent(subComponentID component.ID, status Status) *Event {
	return &Event{
		subComponentID: subComponentID,
		status:         status,
		timestamp:      time.Now(),
	}
}
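To make the proposal easier to discuss, here is a self-contained sketch of how the extended event might look. `Status`, `ID`, and `Event` below are simplified stand-ins rather than the collector's real types, and the `SubComponentID` accessor and `NewEvent` constructor are assumptions added for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// Status is a reduced stand-in for the collector's status enum.
type Status int

const (
	StatusNone Status = iota
	StatusOK
	StatusRecoverableError
)

func (s Status) String() string {
	switch s {
	case StatusOK:
		return "StatusOK"
	case StatusRecoverableError:
		return "StatusRecoverableError"
	default:
		return "StatusNone"
	}
}

// ID is a reduced stand-in for component.ID (a type plus an optional name).
type ID struct {
	Type string
	Name string
}

func (id ID) String() string {
	if id.Name == "" {
		return id.Type
	}
	return id.Type + "/" + id.Name
}

// Event is a hypothetical event carrying an optional subcomponent ID.
type Event struct {
	subComponentID ID // zero value means the event describes the component itself
	status         Status
	timestamp      time.Time
}

// NewEvent creates an event for the component itself.
func NewEvent(status Status) *Event {
	return &Event{status: status, timestamp: time.Now()}
}

// NewSubComponentEvent creates an event attributed to a subcomponent.
func NewSubComponentEvent(subComponentID ID, status Status) *Event {
	return &Event{subComponentID: subComponentID, status: status, timestamp: time.Now()}
}

// Status returns the status carried by the event.
func (e *Event) Status() Status { return e.status }

// SubComponentID returns the subcomponent ID and whether one was set.
func (e *Event) SubComponentID() (ID, bool) {
	return e.subComponentID, e.subComponentID != (ID{})
}

func main() {
	ev := NewSubComponentEvent(ID{Type: "scraper", Name: "cpu"}, StatusOK)
	if id, ok := ev.SubComponentID(); ok {
		fmt.Printf("subcomponent %s reported %s\n", id, ev.Status())
	}
}
```

Keeping the field private and exposing it through an accessor that also reports whether it was set would let existing consumers ignore subcomponent events without any change.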
The healthcheckv2 extension would need only small, non-breaking changes to recursively parse the proposed Event structure.
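As a rough illustration of what "recursively" could mean here, the sketch below nests subcomponent statuses under their parent in a tree and rolls the worst status upward. `Node`, `Record`, and `worst` are hypothetical names, not the extension's actual aggregator, and the roll-up rule is a deliberate simplification (a parent's status is simply the worst of its children):

```go
package main

import "fmt"

// Status is a reduced stand-in for the collector's status enum,
// ordered so that a larger value means a worse condition.
type Status int

const (
	StatusOK Status = iota
	StatusRecoverableError
	StatusPermanentError
)

// Node is a hypothetical aggregation node: a status plus nested children.
type Node struct {
	Status   Status
	Children map[string]*Node
}

// Record stores status at the given path (for example component, then
// subcomponent) and re-aggregates every ancestor on the way back up.
func (n *Node) Record(path []string, status Status) {
	if len(path) == 0 {
		n.Status = status
		return
	}
	if n.Children == nil {
		n.Children = map[string]*Node{}
	}
	child, ok := n.Children[path[0]]
	if !ok {
		child = &Node{}
		n.Children[path[0]] = child
	}
	child.Record(path[1:], status)
	n.Status = worst(n.Children)
}

// worst returns the most severe status among the given children.
func worst(children map[string]*Node) Status {
	w := StatusOK
	for _, c := range children {
		if c.Status > w {
			w = c.Status
		}
	}
	return w
}

func main() {
	root := &Node{}
	root.Record([]string{"hostmetrics", "cpu"}, StatusOK)
	root.Record([]string{"hostmetrics", "disk"}, StatusRecoverableError)

	fmt.Println("hostmetrics degraded:", root.Children["hostmetrics"].Status == StatusRecoverableError)
}
```

Because the path can be any length, the same code handles a component with no subcomponents and one with several levels of nesting.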
Receiver creator reference issue: open-telemetry/opentelemetry-collector-contrib#39053
Would love some feedback on this. Does this approach make sense to you?