-
Notifications
You must be signed in to change notification settings - Fork 319
feat: support pagination in list_*
methods in rest catalog
#2158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: support pagination in list_*
methods in rest catalog
#2158
Conversation
list_*
methods in rest catalog
pyiceberg/catalog/rest/__init__.py
Outdated
list_tables_response = self.list_tables_raw(namespace=namespace, page_size=page_size, next_page_token=next_page_token) | ||
tables.extend([(*table.namespace, table.name) for table in list_tables_response.identifiers]) | ||
if list_tables_response.next_page_token is None: | ||
break | ||
else: | ||
next_page_token = list_tables_response.next_page_token | ||
|
||
return tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we want to return a special Iterable[Identifier]
that calls the next page when the current page is exhausted. This avoids pulling in all the Identifier
s right away, reducing memory pressure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, I think I can implement something. That will be a breaking change though right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an iter_*
method for each listable and opted to just call list(iter) in the existing list_* methods to avoid a breaking change -- If you have suggestions/guidance on how to accomplish both without making a breaking change I will happily implement as that would be much better but I didn't see an obvious way to do it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will be partially breaking, but Iterable
is pretty close to List
, so I think the community might be okay with the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is swapping to Iterable
what we want to do then? It makes the rest catalog break the interface as every other catalog returns lists -- so would need to change every other catalog to also return an Iterable
in the type signature
Closes #2084
Rationale for this change
Support pagination!
Are these changes tested?
Added tests
Are there any user-facing changes?
Yes. Adds a new argument of
page_size
to eachlist_*
method on the rest catalog and also adds newlist_*_raw
methods for users to build their own abstractions ontop of in addition to supporting pagination in the existing implementations