Skip to content

feat: support pagination in list_* methods in rest catalog #2158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jayceslesar
Copy link
Contributor

@jayceslesar jayceslesar commented Jun 28, 2025

Closes #2084

Rationale for this change

Support pagination!

Are these changes tested?

Added tests

Are there any user-facing changes?

Yes. Adds a new argument of page_size to each list_* method on the rest catalog and also adds new list_*_raw methods for users to build their own abstractions ontop of in addition to supporting pagination in the existing implementations

@jayceslesar jayceslesar changed the title feat: support pagination in list_* methods in rest catalog feat: support pagination in list_* methods in rest catalog Jun 28, 2025
Comment on lines 623 to 630
list_tables_response = self.list_tables_raw(namespace=namespace, page_size=page_size, next_page_token=next_page_token)
tables.extend([(*table.namespace, table.name) for table in list_tables_response.identifiers])
if list_tables_response.next_page_token is None:
break
else:
next_page_token = list_tables_response.next_page_token

return tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we want to return a special Iterable[Identifier] that calls the next page when the current page is exhausted. This avoids pulling in all the Identifiers right away, reducing memory pressure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, I think I can implement something. That will be a breaking change though right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an iter_* method for each listable and opted to just call list(iter) in the existing list_* methods to avoid a breaking change -- If you have suggestions/guidance on how to accomplish both without making a breaking change I will happily implement as that would be much better but I didn't see an obvious way to do it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will be partially breaking, but Iterable is pretty close to List, so I think the community might be okay with the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is swapping to Iterable what we want to do then? It makes the rest catalog break the interface as every other catalog returns lists -- so would need to change every other catalog to also return an Iterable in the type signature

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rest catalog does not support pagination
3 participants