Description
Most of the data in the data folder are only used for testing and not useful for any examples. However, they add up a considerable amount of space (MBs) relative to the pvlib package itself. For applications such as WASM (code running in the browser), all this clutter slows down loading significantly. The data folder is currently 57 MB out of a total of 63 MB.
Of course, there are data files that we want to keep such as the Linke turbidity and CEC database. However, a rough estimation shows that removing unused data files can reduce the pvlib package but half!
Excluding/including data files can be specified in the pyproject.toml file, for example:
[tool.setuptools.packages.find]
where = ["src"] # list of folders that contain the packages (["."] by default)
include = ["my_package*"] # package names should match these glob patterns (["*"] by default)
exclude = ["my_package.tests*"] # exclude packages matching these glob patterns (empty by default)
This can also be done in the MANIFEST.in file, which is pvlib's current approach, and seems to have finer control.
The pvlib/tests folder can also be excluded, and save 2.5 MB.