DEPS: bump pyarrow minimum version from 10.0 to 12.0 #61723
@@ -20,11 +19,10 @@
 pa_version_under18p0 = _palv < Version("18.0.0")
 pa_version_under19p0 = _palv < Version("19.0.0")
 pa_version_under20p0 = _palv < Version("20.0.0")
-HAS_PYARROW = True
+HAS_PYARROW = _palv >= Version("12.0.1")
I checked the current usages of `HAS_PYARROW`, and essentially everywhere we mean it to be a supported version of pyarrow (I didn't check the tests, but we only run those with supported versions anyway).
By changing the definition here, we can use `HAS_PYARROW` in other places to protect imports (the ones that currently use `if not pa_version_under10p1`), and then we don't have to update those every time we bump the minimum version.
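A minimal sketch of that idea, not the actual pandas code: a single flag that is true only when pyarrow is installed and at least the supported minimum, so import guards never need to name a specific version. The helper names (`parse_release`, `has_supported_pyarrow`, `installed_pyarrow_version`) are hypothetical, and the simplified tuple comparison is an assumption standing in for the `Version` class used in the diff above.

```python
from __future__ import annotations

import importlib.metadata

# Minimum proposed in this PR; bumping it later only touches this one line.
MIN_PYARROW = (12, 0, 1)


def parse_release(version: str) -> tuple[int, ...]:
    """Parse the leading numeric components, e.g. '12.0.1' -> (12, 0, 1).

    Simplified on purpose: parsing stops at the first non-numeric component,
    so pre-release and dev suffixes are not handled like a real version parser.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_supported_pyarrow(installed: str | None, minimum=MIN_PYARROW) -> bool:
    """True only if pyarrow is installed AND meets the minimum version."""
    if installed is None:
        return False
    return parse_release(installed) >= minimum


def installed_pyarrow_version() -> str | None:
    """Return the installed pyarrow version string, or None if it is missing."""
    try:
        return importlib.metadata.version("pyarrow")
    except importlib.metadata.PackageNotFoundError:
        return None


HAS_PYARROW = has_supported_pyarrow(installed_pyarrow_version())

# Call sites guard on the flag instead of a version-specific name:
if HAS_PYARROW:
    import pyarrow as pa  # noqa: F401
```

With this shape, a guard such as `if not pa_version_under10p1:` becomes `if HAS_PYARROW:`, and the call sites stay unchanged the next time the minimum is raised.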
@@ -307,7 +307,7 @@ Dependency Minimum Version pip ex
 `PyTables <https://github.com/PyTables/PyTables>`__ 3.8.0 hdf5 HDF5-based reading / writing
 `zlib <https://github.com/madler/zlib>`__ hdf5 Compression for HDF5
 `fastparquet <https://github.com/dask/fastparquet>`__ 2024.2.0 - Parquet reading / writing (pyarrow is default)
-`pyarrow <https://github.com/apache/arrow>`__ 10.0.1 parquet, feather Parquet, ORC, and feather reading / writing
+`pyarrow <https://github.com/apache/arrow>`__ 12.0.1 parquet, feather Parquet, ORC, and feather reading / writing
So this is for the "Other data sources" section.
As pyarrow now also improves performance for a default dtype, should it also be added to the "Performance dependencies (recommended)" section, or will that be done in another PR, #61722?
@@ -26,7 +26,6 @@
 from pandas.compat.numpy import is_numpy_dev
 from pandas.compat.pyarrow import (
 HAS_PYARROW,
-pa_version_under10p1,
 pa_version_under11p0,
Can pa_version_under11p0 be removed as well?
For our support window of 2 years, we can bump the minimum pyarrow version to 12.0.1 (see the list of release dates here: https://arrow.apache.org/release/). We could also bump directly to 13, assuming the final 3.0 release will happen in 1-2 months.