Description
Is your feature request related to a problem? Please describe.
I'm running the OpenTelemetry Collector as a sidecar with ECS services. I use it to parse and filter the StatsD metrics that AppMesh/Envoy produces, and then I use emfexporter to publish the metrics to CloudWatch via a CloudWatch log stream. This mostly works. However, when my ECS service scales to multiple instances, I often see the following error in my logs:
2022-03-07T10:05:33.439Z warn [email protected]/cwlog_client.go:84 cwlog_client: Error occurs in PutLogEvents, will search the next token and retry the request
{
"kind": "exporter",
"name": "awsemf/statsd/envoy_metrics",
"error": "InvalidSequenceTokenException: The given sequenceToken is invalid. The next expected sequenceToken is: 49626189108498028449043455519612405404976381845984773650\n{\n RespMetadata: {\n StatusCode: 400,\n RequestID: \"a24432dd-4d17-44ae-b245-3877cfffabb7\"\n },\n ExpectedSequenceToken: \"49626189108498028449043455519612405404976381845984773650\",\n Message_: \"The given sequenceToken is invalid. The next expected sequenceToken is: 49626189108498028449043455519612405404976381845984773650\"\n}"
}
This is caused by a race condition: two nodes now write to the same log stream in CloudWatch, and each invalidates the other's sequenceToken, which the AWS API requires in order to put logs to CloudWatch.
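To illustrate the race, here is a toy model in Python. It is only a sketch: `ToyLogStream`, `Writer` and `simulate` are hypothetical names, the token format is made up, and nothing here calls the real AWS API or reflects the exporter's actual code. It assumes each writer caches the token returned by its last successful write and, on rejection, looks up the current token and retries once.

```python
class InvalidSequenceToken(Exception):
    """Raised by the toy stream when a stale token is presented."""


class ToyLogStream:
    """Toy stand-in for a log stream that only accepts the current token."""

    def __init__(self):
        self._counter = 0
        self.events = []

    @property
    def next_token(self):
        return f"token-{self._counter}"

    def put(self, token, message):
        # Reject the write unless the caller holds the current token.
        if token != self.next_token:
            raise InvalidSequenceToken(f"expected {self.next_token}, got {token}")
        self.events.append(message)
        self._counter += 1
        return self.next_token


class Writer:
    """Hypothetical exporter instance that caches the token of its last write."""

    def __init__(self, name, stream):
        self.name = name
        self.stream = stream
        self.token = stream.next_token
        self.rejected = 0

    def send(self, message):
        try:
            self.token = self.stream.put(self.token, f"{self.name}: {message}")
        except InvalidSequenceToken:
            # Look up the current token and retry once, as the warning describes.
            self.rejected += 1
            self.token = self.stream.put(
                self.stream.next_token, f"{self.name}: {message}"
            )


def simulate(rounds, writer_names):
    """Let the writers take turns; return rejections per writer and event count."""
    stream = ToyLogStream()
    writers = [Writer(name, stream) for name in writer_names]
    for i in range(rounds):
        for writer in writers:
            writer.send(f"batch {i}")
    return {w.name: w.rejected for w in writers}, len(stream.events)
```

With a single writer the cached token is always current, so nothing is rejected. As soon as a second writer shares the stream, almost every write arrives with a stale token and has to be retried, which matches the warnings above.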
Describe the solution you'd like
I was hoping to additionally configure the resourcedetection processor:
"resource":
  "attributes":
    - "action": "insert"
      "from_attribute": "aws.ecs.task.id"
      "key": "TaskId"
    - "action": "insert"
      "from_attribute": "aws.ecs.task.arn"
      "key": "TaskARN"
"resourcedetection":
  "detectors":
    - "env"
    - "ecs"
so that I would be able to use the {TaskId} dynamic field when configuring emfexporter, like this:
"awsemf/statsd/envoy_metrics":
  "dimension_rollup_option": "NoDimensionRollup"
  "log_group_name": "/aws/ecs/dev/hello-world"
  "log_stream_name": "emf/otel/statsd/envoy_metrics/{TaskId}"
  "namespace": "dev/AppMeshEnvoy"
However, when I run my service, I can see that only the following is detected by resourcedetection:
2022-03-07T13:11:17.808Z info internal/resourcedetection.go:139 detected resource information
{
"kind": "processor",
"name": "resourcedetection",
"resource": {
"aws.ecs.cluster.arn": "arn:aws:ecs:eu-west-1:506501033716:cluster/dev",
"aws.ecs.launchtype": "fargate",
"aws.ecs.task.arn": "arn:aws:ecs:eu-west-1:506501033716:task/dev/1a8d528834e046b183d4913feeaa16bc",
"aws.ecs.task.family": "dev-hello-world",
"aws.ecs.task.revision": "43",
"cloud.account.id": "506501033716",
"cloud.availability_zone": "eu-west-1a",
"cloud.platform": "aws_ecs",
"cloud.provider": "aws",
"cloud.region": "eu-west-1"
}
}
Describe alternatives you've considered
I tried to use TaskARN, but that just led to no log stream being created at all. Most likely, the reason is that task ARNs contain characters that are illegal in a log stream name, so the emfexporter fails silently, unable to create one.
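To make the suspected problem concrete, here is a small Python sketch. It assumes the documented CloudWatch Logs naming rule (1 to 512 characters, no ':' or '*') and that the task ID is the last '/'-separated segment of the task ARN. The helper names (`is_valid_log_stream_name`, `task_id_from_arn`, `stream_name_for`) are hypothetical and are not part of the exporter or any AWS SDK.

```python
import re

# Assumption: CloudWatch Logs disallows ':' and '*' in log stream names
# and limits them to 1..512 characters.
_FORBIDDEN = re.compile(r"[:*]")


def is_valid_log_stream_name(name):
    """Check a candidate log stream name against the assumed naming rule."""
    return 1 <= len(name) <= 512 and _FORBIDDEN.search(name) is None


def task_id_from_arn(task_arn):
    """Return the last '/'-separated segment of an ECS task ARN."""
    if not task_arn.startswith("arn:") or "/" not in task_arn:
        raise ValueError(f"not a task ARN: {task_arn!r}")
    return task_arn.rsplit("/", 1)[-1]


def stream_name_for(task_arn, template="emf/otel/statsd/envoy_metrics/{TaskId}"):
    """Fill the {TaskId} placeholder with the ID derived from the ARN."""
    return template.replace("{TaskId}", task_id_from_arn(task_arn))


arn = "arn:aws:ecs:eu-west-1:506501033716:task/dev/1a8d528834e046b183d4913feeaa16bc"
print(is_valid_log_stream_name(arn))                   # the raw ARN contains ':'
print(stream_name_for(arn))                            # name built from the task ID
print(is_valid_log_stream_name(stream_name_for(arn)))
```

The raw ARN fails the check because of its colons, while the name built from the trailing task ID passes. This is why a TaskId attribute, or some way to derive it from aws.ecs.task.arn, would be enough for my use case.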
Additional context
N/A.