Component(s)
receiver/tcplog
What happened?
Description
Customer has multiple processes (clients) connecting via TCP to our agent.
They noticed that the final "Log Events" were corrupted (i.e. random extra characters and/or replaced characters).
The root cause is this line of code:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/operator/input/tcp/tcp.go#L292
func (t *Input) handleMessage(ctx context.Context, conn net.Conn, log []byte) {
decoded, err := t.encoding.Decode(log)
There is a separate goroutine for each accepted TCP connection, and each one can call this in parallel.
However, the Encoding struct uses a single shared buffer:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/operator/helper/encoding.go#L30-L52
decodeBuffer: make([]byte, 1<<12),
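To illustrate why a single shared buffer is a problem, here is a minimal sketch. It is not the collector's code: `sharedDecoder` is a hypothetical type, and the `copy` call stands in for the real decoding step. It assumes the decoder writes into the shared buffer and returns a slice of it. Even without any goroutines, a second call overwrites the bytes the first caller is still holding:

```go
package main

import "fmt"

// sharedDecoder is a hypothetical stand-in for a decoder that keeps one
// scratch buffer for all callers.
type sharedDecoder struct {
	buf []byte
}

// Decode writes into the shared buffer and returns a slice that aliases it.
// The copy is a placeholder for the real character-set transform.
func (d *sharedDecoder) Decode(msg []byte) []byte {
	n := copy(d.buf, msg)
	return d.buf[:n]
}

func main() {
	d := &sharedDecoder{buf: make([]byte, 1<<12)}

	first := d.Decode([]byte("aaaa")) // first caller keeps this slice
	_ = d.Decode([]byte("bb"))        // second caller reuses the same buffer

	// The first result has been partially overwritten by the second call.
	fmt.Printf("%q\n", first) // prints "bbaa"
}
```

With one goroutine per connection, the same overwrite can happen at unpredictable points, which would match the "extra and/or replaced characters" symptom.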
The open source version of Stanza does not share a buffer this way.
The latest Stanza creates a new buffer in each call to Decode():
https://github.com/observIQ/stanza/blob/main/operator/helper/encoding.go#L44-L62
So I recommend syncing with the latest Stanza.
Steps to Reproduce
- Start many client programs that write different "LogEvents" messages. Vary the size and contents.
- Monitor the output of the collector/agent. The corruption should reproduce regardless of which exporter is used.
Expected Result
- The message sent by client is exactly what gets exported.
Actual Result
- The exported messages are corrupted (random extra characters and/or replaced characters).
Collector version
0.77.0
Environment information
Environment
OS: AL2
Compiler(if manually compiled): go 1.20
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response