Skip to content

Hook up ReleaseFiles to the cleanup/deletion script #95870

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 22, 2025

Conversation

Swatinem
Copy link
Member

As the ReleaseFile is cross-project, and has no index on the newly added date columns,
I have duplicated the per-project deletion code to rather run per-organization.

I’m not entirely confident in this TBH, and I wonder if this has any tests?

@Swatinem Swatinem requested review from armenzg and a team July 18, 2025 11:16
@Swatinem Swatinem self-assigned this Jul 18, 2025
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jul 18, 2025
cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As cursor points out, the --organization flag is only used for per-org model deletes but not for other models that would technically support filtering by organization. I think this can also be added in a follow-up to keep the change small, but we should be clear in the CLI doc about it. For example "Limit truncation to only entries from organization, applies only to release files".

Copy link

codecov bot commented Jul 18, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/db/deletion.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #95870      +/-   ##
==========================================
+ Coverage   80.83%   80.84%   +0.01%     
==========================================
  Files       10592    10592              
  Lines      610104   610109       +5     
  Branches    23993    23993              
==========================================
+ Hits       493167   493233      +66     
+ Misses     116227   116166      -61     
  Partials      710      710              

Base automatically changed from swatinem/make-releasefiles-tti to master July 21, 2025 10:21
@Swatinem Swatinem requested review from a team as code owners July 21, 2025 10:21
As the `ReleaseFile` is cross-project, and has no index on the newly added date columns,
I have duplicated the per-project deletion code to rather run per-organization.
Copy link
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Bulk Deletion Ignores Organization Filter

When the cleanup command is executed with the --organization flag, the run_bulk_query_deletes function does not filter deletions by the specified organization. This occurs because run_bulk_query_deletes does not accept or pass the organization_id to the BulkDeleteQuery constructor. As a result, models handled by this function (e.g., UserReport, GroupEmailThread) are deleted from all organizations instead of being limited to the target organization.

src/sentry/runner/commands/cleanup.py#L234-L235

run_bulk_query_deletes(bulk_query_deletes, is_filtered, days, project, project_id)

src/sentry/runner/commands/cleanup.py#L246-L253

q = BulkDeleteQuery(
model=model_tp,
dtfield=dtfield,
days=days,
project_id=project_id,
order_by=order_by,
)

src/sentry/runner/commands/cleanup.py#L478-L520

(GroupEmailThread, "date", None),
] + additional_bulk_query_deletes
return BULK_QUERY_DELETES
def run_bulk_query_deletes(
bulk_query_deletes: list[tuple[type[Model], str, str | None]],
is_filtered: Callable[[type[Model]], bool],
days: int,
project: str | None,
project_id: int | None,
) -> None:
from sentry.db.deletion import BulkDeleteQuery
debug_output("Running bulk query deletes in bulk_query_deletes")
for model_tp, dtfield, order_by in bulk_query_deletes:
chunk_size = 10000
debug_output(f"Removing {model_tp.__name__} for days={days} project={project or '*'}")
if is_filtered(model_tp):
debug_output(">> Skipping %s" % model_tp.__name__)
else:
BulkDeleteQuery(
model=model_tp,
dtfield=dtfield,
days=days,
project_id=project_id,
order_by=order_by,
).execute(chunk_size=chunk_size)
def prepare_deletes_by_project(
project: str | None, project_id: int | None, is_filtered: Callable[[type[Model]], bool]
) -> tuple[QuerySet[Any] | None, list[tuple[Any, str, str]]]:
from sentry.constants import ObjectStatus
from sentry.models.debugfile import ProjectDebugFile
from sentry.models.group import Group
from sentry.models.project import Project
# Deletions that we run per project. In some cases we can't use an index on just the date
# column, so as an alternative we use `(project_id, <date_col>)` instead
DELETES_BY_PROJECT = [

Fix in CursorFix in Web


Was this report helpful? Give feedback by reacting with 👍 or 👎

@Swatinem Swatinem merged commit 9534eef into master Jul 22, 2025
65 checks passed
@Swatinem Swatinem deleted the swatinem/cleanup-releasefiles branch July 22, 2025 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants