Description
Component(s)
pkg/stanza
Describe the issue you're reporting
Stanza pipelines are built with a very aggressive sampling policy.
If I'm reading this correctly, it means that in every one-second window the first log of a particular message is allowed, and after that only 1 in every 10,000 logs of that same message is allowed.
This seems overly aggressive. In practice, it makes the file consumer log only the first file it finds on startup here:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/2d23e4d9e0eb313da5d192e1066f444d19b8601f/pkg/stanza/fileconsumer/file.go#L188C35-L188C35
This means that if 10 files are picked up on startup, only one of them is logged. That makes it hard to debug whether a given file was picked up at all.
In my opinion, we should rely on the sampling that the collector's logger already applies. The collector already supports log sampling, and it samples by default with initial = 100 and thereafter = 100.
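
To illustrate the difference, here is a minimal sketch of the "first N, then every M-th" rule as I understand it. It is a simplified model, not the actual sampler implementation: it covers a single tick for a single message and ignores the tick reset and per-message bookkeeping. The helper names `sampled` and `countKept` are hypothetical and exist only for this example.

```go
package main

import "fmt"

// sampled reports whether the n-th occurrence (1-based) of a message within
// one tick is kept under a "first N, then every M-th" policy.
// Assumption: this mirrors the policy described above in simplified form.
func sampled(n, first, thereafter int) bool {
	if n <= first {
		return true
	}
	return thereafter > 0 && (n-first)%thereafter == 0
}

// countKept returns how many of `total` identical messages emitted within a
// single tick survive sampling.
func countKept(total, first, thereafter int) int {
	kept := 0
	for n := 1; n <= total; n++ {
		if sampled(n, first, thereafter) {
			kept++
		}
	}
	return kept
}

func main() {
	// 10 files picked up on startup, each producing the same log message.
	fmt.Println("initial=1, thereafter=10000:", countKept(10, 1, 10000))
	fmt.Println("initial=100, thereafter=100:", countKept(10, 100, 100))
}
```

Under this model, the 1/10000 policy keeps 1 of the 10 startup messages, while initial = 100 and thereafter = 100 keeps all 10.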