Commit 38f4af9

martindurant and Julian de Ruiter authored
ENH: add fsspec support (#34266)
Co-authored-by: Julian de Ruiter <[email protected]>
1 parent 506eb54 commit 38f4af9

23 files changed, with 279 additions and 250 deletions

ci/deps/azure-36-locale.yaml

Lines changed: 0 additions & 2 deletions
@@ -15,7 +15,6 @@ dependencies:
 
   # pandas dependencies
   - beautifulsoup4
-  - gcsfs
   - html5lib
   - ipython
   - jinja2
@@ -31,7 +30,6 @@ dependencies:
   - pytables
   - python-dateutil
   - pytz
-  - s3fs
   - scipy
   - xarray
   - xlrd

ci/deps/azure-37-locale.yaml

Lines changed: 0 additions & 1 deletion
@@ -27,7 +27,6 @@ dependencies:
   - pytables
   - python-dateutil
   - pytz
-  - s3fs
   - scipy
   - xarray
   - xlrd

ci/deps/azure-windows-37.yaml

Lines changed: 3 additions & 2 deletions
@@ -15,7 +15,8 @@ dependencies:
   # pandas dependencies
   - beautifulsoup4
   - bottleneck
-  - gcsfs
+  - fsspec>=0.7.4
+  - gcsfs>=0.6.0
   - html5lib
   - jinja2
   - lxml
@@ -28,7 +29,7 @@ dependencies:
   - pytables
   - python-dateutil
   - pytz
-  - s3fs
+  - s3fs>=0.4.0
   - scipy
   - sqlalchemy
   - xlrd

ci/deps/travis-36-cov.yaml

Lines changed: 3 additions & 2 deletions
@@ -18,7 +18,8 @@ dependencies:
   - cython>=0.29.16
   - dask
   - fastparquet>=0.3.2
-  - gcsfs
+  - fsspec>=0.7.4
+  - gcsfs>=0.6.0
   - geopandas
   - html5lib
   - matplotlib
@@ -35,7 +36,7 @@ dependencies:
   - pytables
   - python-snappy
   - pytz
-  - s3fs
+  - s3fs>=0.4.0
   - scikit-learn
   - scipy
   - sqlalchemy

ci/deps/travis-36-locale.yaml

Lines changed: 0 additions & 2 deletions
@@ -16,7 +16,6 @@ dependencies:
   - blosc=1.14.3
   - python-blosc
   - fastparquet=0.3.2
-  - gcsfs=0.2.2
   - html5lib
   - ipython
   - jinja2
@@ -33,7 +32,6 @@ dependencies:
   - pytables
   - python-dateutil
   - pytz
-  - s3fs=0.3.0
   - scipy
   - sqlalchemy=1.1.4
   - xarray=0.10

ci/deps/travis-36-slow.yaml

Lines changed: 2 additions & 1 deletion
@@ -13,6 +13,7 @@ dependencies:
 
   # pandas dependencies
   - beautifulsoup4
+  - fsspec>=0.7.4
   - html5lib
   - lxml
   - matplotlib
@@ -25,7 +26,7 @@ dependencies:
   - pytables
   - python-dateutil
   - pytz
-  - s3fs
+  - s3fs>=0.4.0
   - scipy
   - sqlalchemy
   - xlrd

ci/deps/travis-37.yaml

Lines changed: 2 additions & 1 deletion
@@ -13,12 +13,13 @@ dependencies:
 
   # pandas dependencies
   - botocore>=1.11
+  - fsspec>=0.7.4
   - numpy
   - python-dateutil
   - nomkl
   - pyarrow
   - pytz
-  - s3fs
+  - s3fs>=0.4.0
   - tabulate
   - pyreadstat
   - pip

doc/source/getting_started/install.rst

Lines changed: 3 additions & 2 deletions
@@ -267,8 +267,9 @@ SQLAlchemy 1.1.4 SQL support for databases other tha
 SciPy 0.19.0 Miscellaneous statistical functions
 XLsxWriter 0.9.8 Excel writing
 blosc Compression for HDF5
+fsspec 0.7.4 Handling files aside from local and HTTP
 fastparquet 0.3.2 Parquet reading / writing
-gcsfs 0.2.2 Google Cloud Storage access
+gcsfs 0.6.0 Google Cloud Storage access
 html5lib HTML parser for read_html (see :ref:`note <optional_html>`)
 lxml 3.8.0 HTML parser for read_html (see :ref:`note <optional_html>`)
 matplotlib 2.2.2 Visualization
@@ -282,7 +283,7 @@ pyreadstat SPSS files (.sav) reading
 pytables 3.4.3 HDF5 reading / writing
 pyxlsb 1.0.6 Reading for xlsb files
 qtpy Clipboard I/O
-s3fs 0.3.0 Amazon S3 access
+s3fs 0.4.0 Amazon S3 access
 tabulate 0.8.3 Printing in Markdown-friendly format (see `tabulate`_)
 xarray 0.8.2 pandas-like API for N-dimensional data
 xclip Clipboard I/O on linux

doc/source/whatsnew/v1.1.0.rst

Lines changed: 20 additions & 2 deletions
@@ -245,6 +245,22 @@ If needed you can adjust the bins with the argument ``offset`` (a Timedelta) tha
 
 For a full example, see: :ref:`timeseries.adjust-the-start-of-the-bins`.
 
+fsspec now used for filesystem handling
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For reading and writing to filesystems other than local and reading from HTTP(S),
+the optional dependency ``fsspec`` will be used to dispatch operations (:issue:`33452`).
+This will give unchanged
+functionality for S3 and GCS storage, which were already supported, but also add
+support for several other storage implementations such as `Azure Datalake and Blob`_,
+SSH, FTP, dropbox and github. For docs and capabilities, see the `fsspec docs`_.
+
+The existing capability to interface with S3 and GCS will be unaffected by this
+change, as ``fsspec`` will still bring in the same packages as before.
+
+.. _Azure Datalake and Blob: https://github.com/dask/adlfs
+
+.. _fsspec docs: https://filesystem-spec.readthedocs.io/en/latest/
 
 .. _whatsnew_110.enhancements.other:
 
@@ -701,7 +717,9 @@ Optional libraries below the lowest tested version may still work, but are not c
 +-----------------+-----------------+---------+
 | fastparquet     | 0.3.2           |         |
 +-----------------+-----------------+---------+
-| gcsfs           | 0.2.2           |         |
+| fsspec          | 0.7.4           |         |
++-----------------+-----------------+---------+
+| gcsfs           | 0.6.0           |    X    |
 +-----------------+-----------------+---------+
 | lxml            | 3.8.0           |         |
 +-----------------+-----------------+---------+
@@ -717,7 +735,7 @@ Optional libraries below the lowest tested version may still work, but are not c
 +-----------------+-----------------+---------+
 | pytables        | 3.4.3           |    X    |
 +-----------------+-----------------+---------+
-| s3fs            | 0.3.0           |         |
+| s3fs            | 0.4.0           |    X    |
 +-----------------+-----------------+---------+
 | scipy           | 1.2.0           |    X    |
 +-----------------+-----------------+---------+
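
The whatsnew entry in this file says that ``fsspec`` handles everything other than local paths and HTTP(S) reads. A minimal sketch of that routing rule follows, assuming the decision is made from the URL prefix alone. The helper name `classify_path` and the three labels are hypothetical and exist only for illustration; this is not presented as the code pandas actually uses.

```python
def classify_path(path):
    """Label a path or URL string as 'http', 'fsspec', or 'local'.

    Hypothetical helper: it only illustrates the rule described in the
    whatsnew entry and makes no claim about the real implementation.
    """
    # HTTP(S) reads are described as handled separately from fsspec.
    if path.startswith(("http://", "https://")):
        return "http"
    # Any other "protocol://" style URL would be dispatched to fsspec.
    if "://" in path:
        return "fsspec"
    # Everything else is treated as a path on the local filesystem.
    return "local"


for example in ("s3://bucket/data.csv", "https://example.com/data.csv", "data.csv"):
    print(example, "->", classify_path(example))
```

Under this assumption, `s3://` and `gcs://` URLs take the same route as the newly supported protocols, which is consistent with the note that existing S3 and GCS behaviour is unchanged.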

environment.yml

Lines changed: 3 additions & 1 deletion
@@ -98,7 +98,9 @@ dependencies:
 
   - pyqt>=5.9.2 # pandas.read_clipboard
   - pytables>=3.4.3 # pandas.read_hdf, DataFrame.to_hdf
-  - s3fs # pandas.read_csv... when using 's3://...' path
+  - s3fs>=0.4.0 # file IO when using 's3://...' path
+  - fsspec>=0.7.4 # for generic remote file operations
+  - gcsfs>=0.6.0 # file IO when using 'gcs://...' path
   - sqlalchemy # pandas.read_sql, DataFrame.to_sql
   - xarray # DataFrame.to_xarray
   - cftime # Needed for downstream xarray.CFTimeIndex test
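
The pins introduced across these files are fsspec>=0.7.4, gcsfs>=0.6.0 and s3fs>=0.4.0. As a rough sketch of what those minimums mean in practice, the snippet below compares a version string against them. The names `MINIMUM_VERSIONS`, `parse_version` and `meets_minimum` are hypothetical, and the parser assumes plain dotted numeric versions only (no pre-release or local suffixes).

```python
# Minimum versions copied from the dependency changes in this commit.
MINIMUM_VERSIONS = {"fsspec": "0.7.4", "gcsfs": "0.6.0", "s3fs": "0.4.0"}


def parse_version(version):
    """Turn a dotted numeric version such as '0.7.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(package, installed_version):
    """Return True if installed_version is at least the pinned minimum."""
    minimum = MINIMUM_VERSIONS[package]
    # Tuples compare element by element, so (0, 10, 0) > (0, 7, 4).
    return parse_version(installed_version) >= parse_version(minimum)


print(meets_minimum("s3fs", "0.3.0"))   # the previously listed minimum
print(meets_minimum("gcsfs", "0.6.0"))  # exactly the new minimum
```

Comparing integer tuples rather than raw strings avoids ordering "0.10.0" before "0.7.4".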