Description
Component(s)
receiver/kafka
What happened?
Description
Release 0.124.0 updated the Kafka receiver's `topic` and `encoding` fields.
Collectors on 0.124.0+ that use `text_utf-8` as their `logs::encoding` encounter this error:
receiver: invalid component type: invalid character(s) in type "text_utf-8"
As part of the update, this PR introduced a change that returns an error if the `-` character is used in the encoding. See this function, specifically the `component.NewType(encoding)` call:
// encodingToComponentID converts an encoding string to a component ID using the given encoding as type.
func encodingToComponentID(encoding string) (*component.ID, error) {
	componentType, err := component.NewType(encoding)
	if err != nil {
		return nil, fmt.Errorf("invalid component type: %w", err)
	}
	id := component.NewID(componentType)
	return &id, nil
}
NewType creates a type. It returns an error if the type is invalid. A type must
- have at least one character,
- start with an ASCII alphabetic character and
- can only contain ASCII alphanumeric characters and '_'.
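The naming rules quoted above can be checked without a Collector build. The following is a minimal sketch that approximates those rules with a regular expression; `isValidTypeName` is a hypothetical stand-in, not the actual `component.NewType` implementation, and it does not model any additional constraints (such as a length limit) that the real function may enforce. The encoding names other than `text_utf-8` are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// typePattern approximates the documented rules: at least one character,
// starts with an ASCII letter, then only ASCII letters, digits, or '_'.
// Assumption: this mirrors the quoted doc comment, not the library source.
var typePattern = regexp.MustCompile(`^[a-zA-Z][0-9a-zA-Z_]*$`)

// isValidTypeName is a hypothetical helper that reports whether name
// satisfies the rules above.
func isValidTypeName(name string) bool {
	return typePattern.MatchString(name)
}

func main() {
	// "text_utf-8" is the value from this report; the others are illustrative.
	for _, enc := range []string{"text_utf-8", "text_utf8", "text_utf16", "otlp_proto"} {
		fmt.Printf("%-12s valid=%t\n", enc, isValidTypeName(enc))
	}
}
```

Under these rules only `text_utf-8` is rejected, because `-` is neither alphanumeric nor `_`. That is consistent with the error message above.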
Looking at the `newLogsUnmarshaler` func in the same file, it looks like `utf8` and `utf16` are the expected formats now, but the README still recommends `utf-8`, and the default appears to still be `utf-8`. There is a test validating usage of `utf16` but not `utf8` or `utf-8`.
I do not have Kafka set up, but the Collector fails with this error even before complaining that there are no brokers to connect to.
Steps to Reproduce
Run a 0.124.0 Collector with a `kafkareceiver` using `text_utf-8` as the log encoding. It will immediately error due to the `-` in the encoding. The config I shared still uses the old `encoding`/`topic` fields, but nesting them under `logs:` instead still hits the same error.
Expected Result
The receiver should not error when using a hyphenated value like `text_utf-8` as the log encoding.
Furthermore, the recommended and default format for text encoding should work. If the breaking change was intentional, the documentation should be updated accordingly.
Tests should be added to cover this case.
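As a rough illustration of the expected behavior, one possible direction is to recognize the receiver's own text encodings before treating the value as a component type. The sketch below is an assumption, not the receiver's actual code: `resolveEncoding`, the `"builtin"`/`"extension"` labels, and the `text_` prefix check are all hypothetical, and the type-name pattern only approximates the documented `NewType` rules. The table in `main` shows the kind of cases a test could cover.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// typePattern approximates the documented component type rules (assumption).
var typePattern = regexp.MustCompile(`^[a-zA-Z][0-9a-zA-Z_]*$`)

// resolveEncoding is a hypothetical helper. It returns "builtin" for text
// encodings handled by the receiver itself and "extension" for names that
// would be looked up as component types.
func resolveEncoding(encoding string) (string, error) {
	// Assumption: "text" and "text_<charset>" never need a component ID,
	// so a hyphen in the charset (as in "text_utf-8") is harmless here.
	if encoding == "text" || strings.HasPrefix(encoding, "text_") {
		return "builtin", nil
	}
	if !typePattern.MatchString(encoding) {
		return "", fmt.Errorf("invalid component type: invalid character(s) in type %q", encoding)
	}
	return "extension", nil
}

func main() {
	// Table of cases a regression test could cover; names are illustrative.
	for _, enc := range []string{"text_utf-8", "text_utf8", "text_utf16", "my_encoding", "my-encoding"} {
		kind, err := resolveEncoding(enc)
		fmt.Printf("%-12s kind=%q err=%v\n", enc, kind, err)
	}
}
```

With this ordering, `text_utf-8` resolves as a built-in encoding, while a hyphenated non-text name such as `my-encoding` is still rejected as an invalid component type.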
Actual Result
The Collector errors due to the `-` in the log encoding.
Collector version
0.124.0
Environment information
Environment
OS: macOS/darwin Sequoia 15.0.1
Compiler(if manually compiled): go 1.24.0
OpenTelemetry Collector configuration
receivers:
  kafka/logs:
    brokers:
      - localhost:9092
    client_id: otel-collector
    encoding: text_utf-8
    group_id: otel-collector
    metadata:
      full: true
    protocol_version: 2.0.0
    topic: otlp_logs
exporters:
  nop/devnull: null
service:
  pipelines:
    logs:
      receivers:
        - kafka/logs
      processors: []
      exporters:
        - nop/devnull
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: localhost
                port: 8888
Log output
cannot start pipelines: failed to start "kafka/logs" receiver: invalid component type: invalid character(s) in type "text_utf-8"
Additional context
No response