labelsHashToGlobal is not being garbage collected #7003

Open
@kchestnov

Description

What's wrong?

Grafana Agent 0.39.0, running in flow mode as a StatefulSet with clustering enabled, does not release some of the memory allocated by GetOrAddGlobalRefID and GetOrAddLink.

I believe this happens because this function does not cover all cases: https://github.com/grafana/agent/blob/v0.39.0/service/labelstore/service.go#L239

The graphs confirm that agent_labelstore_global_ids_count does not decrease over time; the only thing that brings it down is a restart.

~/grafana_heaps: go tool pprof grafana-agent-1_heap.out*
File: grafana-agent
Build ID: 54e00906a3fcdad65d14ca518d1296d9b729021b
Type: inuse_space
Time: Jul 25, 2024 at 12:11pm (CEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 14010.80MB, 86.70% of 16160.64MB total
Dropped 427 nodes (cum <= 80.80MB)
Showing top 10 nodes out of 75
      flat  flat%   sum%        cum   cum%
 5452.44MB 33.74% 33.74%  5452.44MB 33.74%  github.com/grafana/agent/service/labelstore.(*service).GetOrAddGlobalRefID
 5072.37MB 31.39% 65.13%  5072.37MB 31.39%  github.com/grafana/agent/service/labelstore.(*service).GetOrAddLink
 1164.51MB  7.21% 72.33%  1164.51MB  7.21%  github.com/prometheus/prometheus/storage/remote.labelsToLabelsProto.func1
  501.75MB  3.10% 75.44%   501.75MB  3.10%  github.com/prometheus/prometheus/model/labels.(*ScratchBuilder).Labels (inline)
  457.26MB  2.83% 78.27%   457.26MB  2.83%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
  336.94MB  2.08% 80.35%   336.94MB  2.08%  github.com/prometheus/prometheus/scrape.newScrapePool.func1
  281.97MB  1.74% 82.10%   803.55MB  4.97%  github.com/prometheus/prometheus/storage/remote.(*QueueManager).StoreSeries
     270MB  1.67% 83.77%      270MB  1.67%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
  260.37MB  1.61% 85.38%   260.37MB  1.61%  github.com/golang/snappy.Encode
  213.19MB  1.32% 86.70%   213.19MB  1.32%  github.com/prometheus/common/model.LabelSet.Merge
(pprof) 

image (4)
image (5)
image (6)

Steps to reproduce

Run grafana-agent in flow mode in a high-cardinality environment.

System information

k8s 1.26.12

Software version

v0.39.0

Configuration

No response

Logs

No response

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working
    needs-attention: An issue or PR has been sitting around and needs attention.
