
Parquet Index out of bounds, seems to be related to attributes #3370

@JavaPerformance

Description


Describe the bug
Tempo 2.3.1 and HEAD, vParquet2 and vParquet3, all versions of the OTEL jars from 1.28.0 to 1.34.1. When setting more than 2 attributes on a trace (well, it seems that way), I get the following panic when subsequently searching for or requesting that trace through Grafana (10.3.1). This is with custom Java instrumentation using the OTEL gRPC exporter.
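
The instrumentation is along these lines; this is only a sketch, and the endpoint and attribute names are illustrative rather than the exact production code:

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;

public class Repro {
    public static void main(String[] args) {
        // OTLP gRPC exporter pointing at Tempo's receiver (port 4317 as configured below);
        // the hostname here is illustrative.
        OtlpGrpcSpanExporter exporter = OtlpGrpcSpanExporter.builder()
                .setEndpoint("http://tempo:4317")
                .build();

        SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
                .addSpanProcessor(BatchSpanProcessor.builder(exporter).build())
                .build();
        OpenTelemetrySdk sdk = OpenTelemetrySdk.builder()
                .setTracerProvider(tracerProvider)
                .build();

        Tracer tracer = sdk.getTracer("custom-instrumentation");

        Span span = tracer.spanBuilder("unit-of-work").startSpan();
        // More than two string attributes on the span; "LogID" matches the
        // dedicated column configured in tempo.yaml below, the others are placeholders.
        span.setAttribute("LogID", "ABC123");
        span.setAttribute("attr.one", "value-1");
        span.setAttribute("attr.two", "value-2");
        span.end();

        tracerProvider.forceFlush();
        tracerProvider.shutdown();
    }
}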

To Reproduce
N/A

Expected behavior
No Panic

Environment:
All running on IBM zSeries zCX (which is basically s390x Linux in Docker); I built everything locally.

level=info ts=2024-02-05T16:47:04.375394627Z caller=module_service.go:82 msg=starting module=overrides-api
level=info ts=2024-02-05T16:47:04.375820294Z caller=module_service.go:82 msg=starting module=query-frontend
level=info ts=2024-02-05T16:47:04.37706481Z caller=app.go:205 msg="Tempo started"
2024/02/05 16:47:17 http: superfluous response.WriteHeader call from github.com/opentracing-contrib/go-stdlib/nethttp.(*statusCodeTracker).WriteHeader (status-code-tracker.go:17)
level=info ts=2024-02-05T16:47:21.291170542Z caller=handler.go:134 tenant=single-tenant method=GET traceID=0f9d31ca39cde943 url=/api/v2/search/tag/name/values duration=3.346961ms response_size=2 status=200
level=info ts=2024-02-05T16:47:21.291492019Z caller=handler.go:134 tenant=single-tenant method=GET traceID=19b19edf85d482c3 url=/api/v2/search/tag/status/values duration=805.789µs response_size=117 status=200
level=info ts=2024-02-05T16:47:21.291862437Z caller=handler.go:134 tenant=single-tenant method=GET traceID=793637e8641d4ebc url=/api/v2/search/tag/resource.service.name/values duration=1.855ms response_size=2 status=200
level=info ts=2024-02-05T16:47:21.407236446Z caller=handler.go:134 tenant=single-tenant method=GET traceID=23dd039f241fd9c7 url="/api/v2/search/tag/resource.service.name/values?q=%7B%7D" duration=805.168µs response_size=2 status=200
level=info ts=2024-02-05T16:47:21.464914571Z caller=handler.go:134 tenant=single-tenant method=GET traceID=1b87a53359fb95c1 url="/api/v2/search/tag/status/values?q=%7B%7D" duration=1.006112ms response_size=117 status=200
level=info ts=2024-02-05T16:47:21.465171424Z caller=handler.go:134 tenant=single-tenant method=GET traceID=4012e9a2444b5f9c url="/api/v2/search/tag/name/values?q=%7B%7D" duration=958.965µs response_size=2 status=200
level=info ts=2024-02-05T16:47:22.268794294Z caller=searchsharding.go:253 msg="sharded search query request stats and SearchMetrics" tenant=single-tenant query="q={}&limit=20&spss=3&start=1707130042&end=1707151642" duration_seconds=1.513395ms request_throughput=0 total_requests=1 started_requests=1 cancelled_requests=0 finished_requests=1 totalBlocks=0 inspectedBytes=0 inspectedTraces=0 totalBlockBytes=0
level=info ts=2024-02-05T16:47:22.269073544Z caller=handler.go:134 tenant=single-tenant method=GET traceID=45ea7086fe10b7e3 url="/api/search?q=%7B%7D&limit=20&spss=3&start=1707130042&end=1707151642" duration=1.814402ms response_size=57 status=200
level=info ts=2024-02-05T16:47:59.476109305Z caller=distributor.go:689 msg=received spanid=65f003adeb610f35 traceid=439a1a658925d3531518f8fe08c24243
level=info ts=2024-02-05T16:48:07.288417111Z caller=searchsharding.go:253 msg="sharded search query request stats and SearchMetrics" tenant=single-tenant query="q={}&limit=20&spss=3&start=1707130087&end=1707151687" duration_seconds=2.319437ms request_throughput=5.082267808955363e+06 total_requests=1 started_requests=1 cancelled_requests=0 finished_requests=1 totalBlocks=0 inspectedBytes=11788 inspectedTraces=0 totalBlockBytes=0
level=info ts=2024-02-05T16:48:07.288618898Z caller=handler.go:134 tenant=single-tenant method=GET traceID=511c2470df2e010d url="/api/search?q=%7B%7D&limit=20&spss=3&start=1707130087&end=1707151687" duration=2.587267ms response_size=82 status=200
level=info ts=2024-02-05T16:48:09.795568132Z caller=searchsharding.go:253 msg="sharded search query request stats and SearchMetrics" tenant=single-tenant query="q={}&limit=20&spss=3&start=1707149889&end=1707151689" duration_seconds=1.977487ms request_throughput=5.961101134925286e+06 total_requests=1 started_requests=1 cancelled_requests=0 finished_requests=1 totalBlocks=0 inspectedBytes=11788 inspectedTraces=0 totalBlockBytes=0
level=info ts=2024-02-05T16:48:09.795821177Z caller=handler.go:134 tenant=single-tenant method=GET traceID=4dc6cbd05ebca887 url="/api/search?q=%7B%7D&limit=20&spss=3&start=1707149889&end=1707151689" duration=2.254715ms response_size=82 status=200
level=info ts=2024-02-05T16:48:10.577220303Z caller=searchsharding.go:253 msg="sharded search query request stats and SearchMetrics" tenant=single-tenant query="q={}&limit=20&spss=3&start=1707149890&end=1707151690" duration_seconds=1.89249ms request_throughput=6.228830799634344e+06 total_requests=1 started_requests=1 cancelled_requests=0 finished_requests=1 totalBlocks=0 inspectedBytes=11788 inspectedTraces=0 totalBlockBytes=0
level=info ts=2024-02-05T16:48:10.577342163Z caller=handler.go:134 tenant=single-tenant method=GET traceID=69cca1a10408d3d7 url="/api/search?q=%7B%7D&limit=20&spss=3&start=1707149890&end=1707151690" duration=2.039142ms response_size=82 status=200
panic: runtime error: index out of range [16777216] with length 3

goroutine 1765 [running]:
github.com/parquet-go/parquet-go.(*byteArrayPage).index(...)
/root/tempo/vendor/github.com/parquet-go/parquet-go/page.go:984
github.com/parquet-go/parquet-go.(*byteArrayDictionary).lookupString(0xc000716700, {0xc0018a8000, 0x2, 0x400}, {{0xc00182ed10, 0x2, 0x18}})
/root/tempo/vendor/github.com/parquet-go/parquet-go/dictionary_purego.go:42 +0x176
github.com/parquet-go/parquet-go.(*byteArrayDictionary).Lookup(0xc000716700, {0xc0018a8000, 0x2, 0x400}, {0xc00182ed10, 0x2, 0xaa})
/root/tempo/vendor/github.com/parquet-go/parquet-go/dictionary.go:748 +0x156
github.com/parquet-go/parquet-go.(*indexedPageValues).ReadValues(0xc0016c5680, {0xc00182ed10, 0x2, 0xaa})
/root/tempo/vendor/github.com/parquet-go/parquet-go/dictionary.go:1332 +0xd6
github.com/parquet-go/parquet-go.(*repeatedPageValues).ReadValues(0xc0007baac0, {0xc00182ed10, 0xaa, 0xaa})
/root/tempo/vendor/github.com/parquet-go/parquet-go/page_values.go:98 +0x192
github.com/parquet-go/parquet-go.(*rowGroupRows).readRows(0xc000709b20, {0xc0016f6ff0, 0x1, 0x1})
/root/tempo/vendor/github.com/parquet-go/parquet-go/row_group.go:384 +0x1f6
github.com/parquet-go/parquet-go.(*rowGroupRows).ReadRows(0xc000709b20, {0xc0016f6ff0, 0x1, 0x1})
/root/tempo/vendor/github.com/parquet-go/parquet-go/row_group.go:361 +0x1ca
github.com/parquet-go/parquet-go.(*reader).ReadRows(0xc0016d8310, {0xc0016f6ff0, 0x1, 0x1})
/root/tempo/vendor/github.com/parquet-go/parquet-go/reader.go:526 +0x114
github.com/parquet-go/parquet-go.(*Reader).Read(0xc0016d82d0, {0x23211c0, 0xc0007bca00})
/root/tempo/vendor/github.com/parquet-go/parquet-go/reader.go:407 +0x29c
github.com/grafana/tempo/tempodb/encoding/vparquet3.(*walBlock).FindTraceByID(0xc0007c6600, {0x300dbe0, 0xc001578c00}, {0xc001541830, 0x10, 0x10}, {0x400000, 0x0, 0x0, 0x4c4b40, ...})
/root/tempo/tempodb/encoding/vparquet3/wal_block.go:530 +0x334
github.com/grafana/tempo/modules/ingester.(*instance).FindTraceByID(0xc0007e2b00, {0x300dbe0, 0xc001578bd0}, {0xc001541830, 0x10, 0x10})
/root/tempo/modules/ingester/instance.go:441 +0x466
github.com/grafana/tempo/modules/ingester.(*Ingester).FindTraceByID(0xc0007c1800, {0x300dbe0, 0xc001578ba0}, 0xc00157a730)
/root/tempo/modules/ingester/ingester.go:280 +0x1f8
github.com/grafana/tempo/pkg/tempopb._Querier_FindTraceByID_Handler.func1({0x300dbe0, 0xc001578ba0}, {0x2810f80, 0xc00157a730})
/root/tempo/pkg/tempopb/tempo.pb.go:3553 +0x88
github.com/grafana/tempo/cmd/tempo/app.glob..func2({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc001570450)
/root/tempo/cmd/tempo/app/fake_auth.go:22 +0xa2
google.golang.org/grpc.getChainUnaryHandler.func1({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730})
/root/tempo/vendor/google.golang.org/grpc/server.go:1163 +0xe2
github.com/grafana/dskit/middleware.UnaryServerInstrumentInterceptor.func1({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc0016905c0)
/root/tempo/vendor/github.com/grafana/dskit/middleware/grpc_instrumentation.go:33 +0x9e
google.golang.org/grpc.getChainUnaryHandler.func1({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730})
/root/tempo/vendor/google.golang.org/grpc/server.go:1163 +0xe2
github.com/grafana/dskit/server.newServer.HTTPGRPCTracingInterceptor.func2({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc001690580)
/root/tempo/vendor/github.com/grafana/dskit/middleware/http_tracing.go:75 +0xa84
google.golang.org/grpc.getChainUnaryHandler.func1({0x300dbe0, 0xc001578b70}, {0x2810f80, 0xc00157a730})
/root/tempo/vendor/google.golang.org/grpc/server.go:1163 +0xe2
github.com/opentracing-contrib/go-grpc.OpenTracingServerInterceptor.func1({0x300dbe0, 0xc001578ab0}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc001690440)
/root/tempo/vendor/github.com/opentracing-contrib/go-grpc/server.go:57 +0x426
google.golang.org/grpc.getChainUnaryHandler.func1({0x300dbe0, 0xc001578ab0}, {0x2810f80, 0xc00157a730})
/root/tempo/vendor/google.golang.org/grpc/server.go:1163 +0xe2
github.com/grafana/dskit/middleware.GRPCServerLog.UnaryServerInterceptor({{0x2fea680, 0xc0006c47d0}, 0x0, 0x0}, {0x300dbe0, 0xc001578ab0}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc001690400)
/root/tempo/vendor/github.com/grafana/dskit/middleware/grpc_logging.go:49 +0x7e
google.golang.org/grpc.NewServer.chainUnaryServerInterceptors.chainUnaryInterceptors.func1({0x300dbe0, 0xc001578ab0}, {0x2810f80, 0xc00157a730}, 0xc00071c760, 0xc001570450)
/root/tempo/vendor/google.golang.org/grpc/server.go:1154 +0xb8
github.com/grafana/tempo/pkg/tempopb._Querier_FindTraceByID_Handler({0x28b9ce0, 0xc0007c1800}, {0x300dbe0, 0xc001578ab0}, 0xc000766280, 0xc0008980e0)
/root/tempo/pkg/tempopb/tempo.pb.go:3555 +0x17c
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0007e6960, {0x300dbe0, 0xc001578960}, {0x3024780, 0xc0008649c0}, 0xc001577320, 0xc0004f9560, 0x465dce0, 0x0)
/root/tempo/vendor/google.golang.org/grpc/server.go:1343 +0x10b4
google.golang.org/grpc.(*Server).handleStream(0xc0007e6960, {0x3024780, 0xc0008649c0}, 0xc001577320)
/root/tempo/vendor/google.golang.org/grpc/server.go:1737 +0xf54
google.golang.org/grpc.(*Server).serveStreams.func1.1()
/root/tempo/vendor/google.golang.org/grpc/server.go:986 +0xb0
created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 996
/root/tempo/vendor/google.golang.org/grpc/server.go:997 +0x1a8
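
For what it's worth, the failing index has a suspicious shape: 16777216 is 0x01000000, i.e. the value 1 with its bytes reversed in a 32-bit word, while the dictionary only has 3 entries. Since s390x is big-endian, one guess (only a guess) is a byte-order issue when the dictionary index is decoded. A quick check of the arithmetic:

public class IndexCheck {
    public static void main(String[] args) {
        // Index from the panic: "index out of range [16777216] with length 3".
        int panicIndex = 16777216;
        System.out.println(Integer.toHexString(panicIndex));   // prints 1000000 (0x01000000)
        System.out.println(Integer.reverseBytes(panicIndex));  // prints 1, a plausible index into a 3-entry dictionary
    }
}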
level=warn ts=2024-02-05T16:48:46.71380189Z caller=main.go:129 msg="-- CONFIGURATION WARNINGS --"
level=warn ts=2024-02-05T16:48:46.713851245Z caller=main.go:135 msg="Span logging is enabled. This is for debuging only and not recommended for production deployments."
level=info ts=2024-02-05T16:48:46.713863065Z caller=main.go:224 msg="initialising OpenTracing tracer"
level=info ts=2024-02-05T16:48:46.714872771Z caller=main.go:117 msg="Starting Tempo" version="(version=main-3ed534a, branch=main, revision=3ed534a07)"
level=info ts=2024-02-05T16:48:46.715173433Z caller=cache.go:54 msg="caches available to storage backend" footer=false bloom=false offset_idx=false column_idx=false trace_id_idx=false page=false
level=info ts=2024-02-05T16:48:46.715967272Z caller=server.go:238 msg="server listening on addresses" http=[::]:3200 grpc=[::]:9095
level=info ts=2024-02-05T16:48:46.717550795Z caller=frontend.go:40 msg="creating middleware in query frontend"
level=info ts=2024-02-05T16:48:46.717584677Z caller=cache.go:24 msg="init frontend cache" role=frontend-search enabled=false
level=info ts=2024-02-05T16:48:46.71777527Z caller=tempodb.go:504 msg="polling enabled" interval=5m0s blocklist_concurrency=50
level=info ts=2024-02-05T16:48:46.717845979Z caller=poller.go:133 msg="blocklist poll complete" seconds=4.6825e-05
level=warn ts=2024-02-05T16:48:46.718592237Z caller=modules.go:287 msg="Worker address is empty in single binary mode. Attempting automatic worker configuration. If queries are unresponsive consider configuring the worker explicitly." address=127.0.0.1:9095
level=info ts=2024-02-05T16:48:46.718781694Z caller=worker.go:104 msg="Starting querier worker connected to query-frontend" frontend=127.0.0.1:9095
ts=2024-02-05T16:48:46Z level=info msg="OTel Shim Logger Initialized" component=tempo
ts=2024-02-05T16:48:46Z level=warn msg="jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See open-telemetry/opentelemetry-collector-contrib#18485 for more details." component=tempo
level=info ts=2024-02-05T16:48:46.720608579Z caller=module_service.go:82 msg=starting module=internal-server
level=info ts=2024-02-05T16:48:46.7207053Z caller=module_service.go:82 msg=starting module=server
level=info ts=2024-02-05T16:48:46.720786477Z caller=module_service.go:82 msg=starting module=memberlist-kv
level=info ts=2024-02-05T16:48:46.720814711Z caller=module_service.go:82 msg=starting module=usage-report
level=info ts=2024-02-05T16:48:46.721013819Z caller=module_service.go:82 msg=starting module=secondary-ring
level=info ts=2024-02-05T16:48:46.721041165Z caller=module_service.go:82 msg=starting module=cache-provider
level=info ts=2024-02-05T16:48:46.721074152Z caller=module_service.go:82 msg=starting module=metrics-generator-ring
level=info ts=2024-02-05T16:48:46.72110308Z caller=ring.go:276 msg="ring doesn't exist in KV store yet"
level=info ts=2024-02-05T16:48:46.721158724Z caller=module_service.go:82 msg=starting module=store
level=info ts=2024-02-05T16:48:46.721189541Z caller=module_service.go:82 msg=starting module=overrides
level=info ts=2024-02-05T16:48:46.721203958Z caller=module_service.go:82 msg=starting module=overrides-api
level=info ts=2024-02-05T16:48:46.721220253Z caller=module_service.go:82 msg=starting module=ingester
level=info ts=2024-02-05T16:48:46.721240378Z caller=ingester.go:353 msg="beginning wal replay"
level=warn ts=2024-02-05T16:48:46.721278686Z caller=wal.go:111 msg="unowned file entry ignored during wal replay" file=blocks err=null
level=info ts=2024-02-05T16:48:46.721650195Z caller=wal.go:121 msg="beginning replay" file=d1b598ac-2d9b-404a-9f05-7df487821f1e+single-tenant+vParquet3 size=58
level=info ts=2024-02-05T16:48:46.722396815Z caller=wal.go:148 msg="replay complete" file=d1b598ac-2d9b-404a-9f05-7df487821f1e+single-tenant+vParquet3 duration=751.427µs
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
level=info ts=2024-02-05T16:48:46.723370167Z caller=module_service.go:82 msg=starting module=ring
level=info ts=2024-02-05T16:48:46.723441752Z caller=ring.go:276 msg="ring doesn't exist in KV store yet"
level=info ts=2024-02-05T16:48:46.724476697Z caller=ingester.go:391 msg="wal replay complete"
level=info ts=2024-02-05T16:48:46.724500069Z caller=ingester.go:405 msg="reloading local blocks" tenants=0
level=info ts=2024-02-05T16:48:46.725128932Z caller=lifecycler.go:614 msg="not loading tokens from file, tokens file path is empty"
level=info ts=2024-02-05T16:48:46.725161713Z caller=lifecycler.go:639 msg="instance not found in ring, adding with no tokens" ring=ingester
level=info ts=2024-02-05T16:48:46.7252404Z caller=lifecycler.go:483 msg="auto-joining cluster after timeout" ring=ingester
level=info ts=2024-02-05T16:48:46.72534482Z caller=module_service.go:82 msg=starting module=compactor
level=info ts=2024-02-05T16:48:46.725361044Z caller=tempodb.go:504 msg="polling enabled" interval=5m0s blocklist_concurrency=50
level=info ts=2024-02-05T16:48:46.725393056Z caller=poller.go:133 msg="blocklist poll complete" seconds=1.8171e-05
level=info ts=2024-02-05T16:48:46.725401463Z caller=compactor.go:159 msg="enabling compaction"
level=info ts=2024-02-05T16:48:46.725407949Z caller=tempodb.go:476 msg="compaction and retention enabled."
level=info ts=2024-02-05T16:48:46.725486101Z caller=module_service.go:82 msg=starting module=metrics-generator
level=info ts=2024-02-05T16:48:46.725924505Z caller=basic_lifecycler.go:297 msg="instance not found in the ring" instance=5ab89ffea868 ring=metrics-generator
level=info ts=2024-02-05T16:48:46.726126505Z caller=module_service.go:82 msg=starting module=query-frontend
level=info ts=2024-02-05T16:48:46.727159639Z caller=module_service.go:82 msg=starting module=distributor
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
ts=2024-02-05T16:48:46Z level=info msg="Starting GRPC server" component=tempo endpoint=0.0.0.0:4317
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
ts=2024-02-05T16:48:46Z level=info msg="Starting HTTP server" component=tempo endpoint=0.0.0.0:4318
level=info ts=2024-02-05T16:48:46.730070144Z caller=module_service.go:82 msg=starting module=querier
level=info ts=2024-02-05T16:48:46.730262991Z caller=worker.go:180 msg="adding connection" addr=127.0.0.1:9095
ts=2024-02-05T16:48:46Z level=warn msg="Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks" component=tempo documentation=https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks
level=info ts=2024-02-05T16:48:46.730637433Z caller=app.go:205 msg="Tempo started"
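
For completeness, the requests Grafana issues can also be sent directly against Tempo's HTTP API; the sketch below (hostname is illustrative) hits the same search and trace-by-id endpoints that appear in the log above, using the trace ID logged by the distributor, and should exercise the same code path:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class QueryTempo {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Search endpoint, same query Grafana sends (see the /api/search lines above).
        HttpRequest search = HttpRequest.newBuilder()
                .uri(URI.create("http://tempo:3200/api/search?q=%7B%7D&limit=20&spss=3"))
                .GET()
                .build();
        System.out.println(client.send(search, HttpResponse.BodyHandlers.ofString()).body());

        // Trace-by-id endpoint; the trace ID is the one logged by the distributor above.
        HttpRequest byId = HttpRequest.newBuilder()
                .uri(URI.create("http://tempo:3200/api/traces/439a1a658925d3531518f8fe08c24243"))
                .GET()
                .build();
        System.out.println(client.send(byId, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}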

tempo.yaml

server:
  http_listen_port: 3200

query_frontend:
  search:
    duration_slo: 30s
    throughput_bytes_slo: 1.073741824e+09
  trace_by_id:
    duration_slo: 30s

distributor:
  receivers:
    jaeger:
      protocols:
        thrift_http:
        grpc:
        thrift_binary:
        thrift_compact:
    zipkin:
    otlp:
      protocols:
        http:
        grpc:
    opencensus:
  log_received_spans:
    enabled: true

ingester:
  max_block_duration: 5m

compactor:
  compaction:
    block_retention: 720h

metrics_generator:
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/lib/tempo/generator/wal
    remote_write:
      - url: http://XXXXXXX:9090/api/v1/write
        send_exemplars: true

storage:
  trace:
    backend: local
    wal:
      path: /var/lib/tempo/wal
      version: vParquet3
    local:
      path: /var/lib/tempo/blocks
    block:
      version: vParquet3
      parquet_row_group_size_bytes: 10000000
      parquet_dedicated_columns:
        - {scope: span, name: LogID, type: string}

overrides:
  defaults:
    metrics_generator:
      processors: []
