feat: add pg_egress_collect service #486

Merged
merged 6 commits into from
Jan 25, 2023

burmecia
Member

@burmecia burmecia commented Jan 19, 2023

What kind of change does this PR introduce?

This PR adds a service for collecting the Postgres egress metric. For more context, see: https://www.notion.so/supabase/Fix-DB-egress-metric-190164a5c4e5444cbbe20c726cea241d

What is the current behavior?

Currently there is no reliable way to gather accurate database-only egress, so the whole server egress is used instead.

What is the new behavior?

The solution uses tcpdump to capture outgoing TCP packets on ports 5432 and 6543. A Perl script extracts each packet's length and sums the lengths over one-minute intervals. The result is saved to a text file, /tmp/pg_egress_collect.txt by default. The Admin API can then read the metric from that file and expose it to Victoria Metrics.
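As a rough illustration of the aggregation step, here is a minimal sketch in Python (not the Perl script from this PR). It assumes tcpdump prints epoch timestamps (`-tt`) and a trailing `length N` field on each line; the capture command in the comment, the regular expression, and the function name `sum_egress_per_minute` are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed capture command (an illustration, not taken from this PR):
#   tcpdump -l -n -tt 'tcp and (src port 5432 or src port 6543)'
# Assumed line shape:
#   1674100000.123456 IP 10.0.0.5.5432 > 10.0.0.9.51000: Flags [P.], seq 1:101, ack 1, win 502, length 100
LINE_RE = re.compile(r"^(\d+)\.\d+ .*\blength (\d+)")


def sum_egress_per_minute(lines):
    """Return {minute_start_epoch: total_length} for the given tcpdump lines."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that lack a timestamp or a length field
        epoch_seconds = int(match.group(1))
        length = int(match.group(2))
        # Round the timestamp down to the start of its minute.
        minute_start = epoch_seconds - epoch_seconds % 60
        totals[minute_start] += length
    return dict(totals)
```

A long-running collector would read lines from tcpdump's standard output and flush each minute's total as the minute closes; the sketch only shows the bucketing.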

With this approach, we can collect accurate network egress for Postgres and expose it through the Admin API so that Victoria Metrics can gather it for the downstream pipeline.
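On the consumer side, reading the metric could look roughly like the sketch below. It is in Python and is not the Admin API's actual code. It assumes the file holds a plain decimal byte count on its last line, which this PR does not specify; `read_egress_bytes` is a hypothetical helper name.

```python
from pathlib import Path

# Default output path named in this PR.
DEFAULT_PATH = Path("/tmp/pg_egress_collect.txt")


def read_egress_bytes(path=DEFAULT_PATH):
    """Return the last recorded byte count, or None if no usable value exists."""
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return None  # the collector has not written anything yet
    if not text:
        return None
    # Assumption: the most recent value is a decimal integer on the last line.
    last_line = text.splitlines()[-1].strip()
    return int(last_line) if last_line.isdigit() else None
```

Returning `None` rather than raising lets a caller skip a scrape when the collector has not produced a value yet.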

Additional context

This metric will also serve as a data source for the Admin API. For more details, see: https://github.com/supabase/supabase-admin-api/pull/130

@burmecia burmecia self-assigned this Jan 19, 2023
@burmecia burmecia added the enhancement New feature or request label Jan 19, 2023
@burmecia burmecia marked this pull request as ready for review January 19, 2023 06:29
@burmecia burmecia requested a review from a team as a code owner January 19, 2023 06:29
@darora
Contributor

darora commented Jan 19, 2023

@burmecia have you collected data on performance overhead for this across a few projects with different utilizations?

@burmecia
Member Author

I tried it on several projects, including my personal projects and the RevOps project, and saw little overhead. CPU usage is unnoticeable, memory usage is about 10M, and it only writes one line to a tmp file per minute.

@burmecia
Member Author

burmecia commented Jan 20, 2023

I tested its performance on my project. The setup was:

  • Used SQL to generate random time-series data and read it to my local machine using psql:
    SELECT time, random()*100 as cpu_usage
    FROM generate_series(now() - INTERVAL '12 months',now(),INTERVAL '1 minute') as time;
  • Kept monitoring egress for about 1 hour
  • Total egress was about 5GB
  • No other database activity

The test results were:

  • Average CPU usage was around 10%, including postgres
  • Average memory usage by the egress collector was about 15M (tcpdump + perl script)
  • Disk space used: <10 bytes
  • EBS IO balance: stayed at 99% throughout
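For scale, the setup figures above imply an average egress rate of a bit under 1.4 MB/s. The short Python calculation below is only a back-of-the-envelope sketch: it assumes "5GB" means decimal gigabytes and that the test ran for exactly one hour, whereas both figures were stated as approximate.

```python
def average_rate_bytes_per_second(total_bytes, duration_seconds):
    """Average transfer rate over the whole test window."""
    return total_bytes / duration_seconds


TOTAL_EGRESS_BYTES = 5 * 10**9   # "about 5GB", assuming decimal gigabytes
TEST_DURATION_SECONDS = 60 * 60  # "about 1 hour"

rate = average_rate_bytes_per_second(TOTAL_EGRESS_BYTES, TEST_DURATION_SECONDS)
print(f"{rate / 1e6:.2f} MB/s")  # roughly 1.39 MB/s
```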

@burmecia burmecia merged commit 69ce6ae into develop Jan 25, 2023
@burmecia burmecia deleted the feat/add-pg_egress branch January 25, 2023 04:59
damonrand pushed a commit to cepro/postgres that referenced this pull request Jun 15, 2025
Labels
enhancement New feature or request
3 participants