
Add support for VPC flow log #38861

Open
@constanca-m

Description

Component(s)

extension/encoding/awslogsencoding

Is your feature request related to a problem? Please describe.

I would like to use awslogs_encoding to unmarshal VPC flow logs.

Describe the solution you'd like

VPC flow logs can be sent to:

  • CloudWatch Logs
  • S3 (plain text or Parquet)
  • Firehose

VPC flow logs sent to S3 come in a .gz compressed file. Example of the file content (both the plain-text and Parquet versions contain the same fields):

version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742569968 1742570020 - NODATA
2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742570029 1742570081 - NODATA
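For reference, here is a minimal parsing sketch (not the extension's actual code) that splits the header line and one record of the plain-text format above into field name / value pairs, treating "-" as an empty field:

// Minimal sketch only: illustrates how the default plain-text format maps
// header names to values; "-" means the field has no value.
package main

import (
	"fmt"
	"strings"
)

func parseRecord(header, record string) map[string]string {
	names := strings.Fields(header)
	values := strings.Fields(record)
	fields := make(map[string]string, len(names))
	for i, name := range names {
		if i < len(values) && values[i] != "-" {
			fields[name] = values[i]
		}
	}
	return fields
}

func main() {
	header := "version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status"
	record := "2 12345678910 eni-0eb1e4178af74336c - - - - - - - 1742569968 1742570020 - NODATA"
	fmt.Println(parseRecord(header, record))
}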

If sent to CloudWatch Logs:

timestamp,message
1742570569000,2 627286350134 eni-0eb1e4178af74336c - - - - - - - 1742570569 1742570622 - NODATA

We can assume that if the logs are sent to CloudWatch, they will arrive as CloudWatch logs. TODO: we would likely need to add a format for CloudWatch logs that do not come from a subscription filter.

To support VPC flow logs for now, I think we can expect a record similar to the one placed in S3: it is very descriptive, and we know which field is which. For VPC flow logs sent to CloudWatch Logs we do not know that, since it is possible to send them in a custom format.
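As a rough illustration (not a proposal for the final attribute names), one parsed record could then be mapped onto an OTel log record roughly like this, using the flow's start field as the timestamp:

// Rough sketch only: the field-to-attribute mapping and key names are
// placeholders, not the component's agreed semantic conventions.
package vpcflowlog

import (
	"strconv"
	"time"

	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/plog"
)

func recordToLogs(fields map[string]string) plog.Logs {
	logs := plog.NewLogs()
	lr := logs.ResourceLogs().AppendEmpty().ScopeLogs().AppendEmpty().LogRecords().AppendEmpty()

	// Use the flow start time (epoch seconds) as the record timestamp when present.
	if start, err := strconv.ParseInt(fields["start"], 10, 64); err == nil {
		lr.SetTimestamp(pcommon.NewTimestampFromTime(time.Unix(start, 0)))
	}
	// Put every known field on the record as a string attribute; the real
	// keys would follow whatever conventions the component settles on.
	for name, value := range fields {
		lr.Attributes().PutStr(name, value)
	}
	return logs
}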

I expect the configuration to be similar to:

awslogs_encoding:
  format: vpc_flow_log
  vpc_flow_log:
    format: plain-text # this is the default, options [plain-text, parquet]
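On the Go side, the extension's config could gain a matching sub-struct along these lines (a hypothetical sketch; the field and tag names are illustrative, not the final API):

// Hypothetical config sketch for awslogsencoding; names are illustrative only.
type Config struct {
	// Format would gain a new "vpc_flow_log" option.
	Format string `mapstructure:"format"`

	// VPCFlowLog applies only when Format is "vpc_flow_log".
	VPCFlowLog VPCFlowLogConfig `mapstructure:"vpc_flow_log"`
}

type VPCFlowLogConfig struct {
	// FileFormat is "plain-text" (default) or "parquet".
	FileFormat string `mapstructure:"format"`
}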

Describe alternatives you've considered

No response

Additional context

No response
