Description
Component(s)
extension/encoding/awslogsencoding
Is your feature request related to a problem? Please describe.
I would like to use the awslogs_encoding extension to unmarshal VPC flow logs.
Describe the solution you'd like
VPC flow logs can be sent to:
- CloudWatch Logs
- S3 (plain text or Parquet)
- Firehose
VPC flow logs delivered to S3 arrive in .gz-compressed files. Example of file content (plain text and Parquet carry the same fields):
version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742569968 1742570020 - NODATA
2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742570029 1742570081 - NODATA
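Because the S3 file starts with a header line naming every field, unmarshaling the default format can be sketched roughly as below. This is a hypothetical illustration, not the extension's actual implementation; `parseFlowLogRecord` is an invented helper, and the gzip round-trip stands in for reading the real S3 object.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// parseFlowLogRecord maps each value in a space-separated VPC flow log
// record to the field name from the header line. Hypothetical helper,
// not part of the awslogsencoding extension's API.
func parseFlowLogRecord(header, record string) map[string]string {
	names := strings.Fields(header)
	values := strings.Fields(record)
	fields := make(map[string]string, len(names))
	for i, v := range values {
		if i < len(names) {
			fields[names[i]] = v
		}
	}
	return fields
}

func main() {
	// Simulate the .gz object from S3 by compressing the sample content in memory.
	content := "version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status\n" +
		"2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742569968 1742570020 - NODATA\n"
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(content))
	zw.Close()

	// Decompress and parse: the first line is the header, the rest are records.
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		panic(err)
	}
	scanner := bufio.NewScanner(zr)
	scanner.Scan()
	header := scanner.Text()
	for scanner.Scan() {
		fields := parseFlowLogRecord(header, scanner.Text())
		fmt.Println(fields["interface-id"], fields["start"], fields["log-status"])
	}
}
```

The header-driven mapping is what makes the S3 format self-describing: even if the customer selected a custom field order, the header tells us how to interpret each column.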
If sent to CloudWatch Logs, the records look like:
timestamp,message
1742570569000,2 627286350134 eni-0eb1e4178af74336c - - - - - - - 1742570569 1742570622 - NODATA
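Handling such a CloudWatch export line could look like the sketch below: split on the first comma to separate timestamp from message, then parse the message positionally. Again hypothetical; `splitCloudWatchLine` is an invented name, not an existing function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitCloudWatchLine splits a "timestamp,message" CloudWatch export line.
// Only the first comma separates the two fields, so a message that itself
// contains commas stays intact. Hypothetical sketch, not the extension's API.
func splitCloudWatchLine(line string) (int64, string, error) {
	parts := strings.SplitN(line, ",", 2)
	if len(parts) != 2 {
		return 0, "", fmt.Errorf("expected timestamp,message: %q", line)
	}
	ts, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return 0, "", err
	}
	return ts, parts[1], nil
}

func main() {
	line := "1742570569000,2 627286350134 eni-0eb1e4178af74336c - - - - - - - 1742570569 1742570622 - NODATA"
	ts, msg, err := splitCloudWatchLine(line)
	if err != nil {
		panic(err)
	}
	// With no header line, the message fields can only be read positionally,
	// which is exactly the limitation discussed below for custom formats.
	fmt.Println(ts, strings.Fields(msg)[2])
}
```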
We can consider that if sent to CloudWatch, then it will be a CloudWatch log. TODO: we would likely need to add a format for a CloudWatch log that does not come from a subscription filter.
I think that to support VPC flow logs initially, we can expect a record similar to the one placed in S3. It is self-describing, so we know which field is which. For VPC flow logs sent to CloudWatch Logs we do not know that, since it is possible to send them in a custom format.
I expect the configuration to be similar to:
awslogs_encoding:
  format: vpc_flow_log
  vpc_flow_log:
    format: plain-text # this is the default, options [plain-text, parquet]
Describe alternatives you've considered
No response
Additional context
No response