Commit 502db03

author
Matt Roeschke
committed
Merge remote-tracking branch 'upstream/master'
2 parents 8453230 + a4c19e7 commit 502db03

59 files changed: 689 additions and 316 deletions

ci/azure/posix.yml

Lines changed: 6 additions & 0 deletions
@@ -33,6 +33,12 @@ jobs:
   PATTERN: "not slow and not network"
   LOCALE_OVERRIDE: "it_IT.UTF-8"

+py36_32bit:
+  ENV_FILE: ci/deps/azure-36-32bit.yaml
+  CONDA_PY: "36"
+  PATTERN: "not slow and not network"
+  BITS32: "yes"
+
 py37_locale:
   ENV_FILE: ci/deps/azure-37-locale.yaml
   CONDA_PY: "37"

ci/deps/azure-36-32bit.yaml

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+name: pandas-dev
+channels:
+  - defaults
+  - conda-forge
+dependencies:
+  - gcc_linux-32
+  - gcc_linux-32
+  - gxx_linux-32
+  - cython=0.28.2
+  - numpy=1.14.*
+  - python-dateutil
+  - python=3.6.*
+  - pytz=2017.2
+  # universal
+  - pytest>=4.0.2,<5.0.0
+  - pytest-xdist
+  - pytest-mock
+  - pytest-azurepipelines
+  - hypothesis>=3.58.0
+  - pip

ci/deps/azure-37-numpydev.yaml

Lines changed: 2 additions & 1 deletion
@@ -17,4 +17,5 @@ dependencies:
   - "--pre"
   - "numpy"
   - "scipy"
-  - pytest-azurepipelines
+  # https://github.com/pandas-dev/pandas/issues/27421
+  - pytest-azurepipelines<1.0.0

ci/deps/azure-macos-35.yaml

Lines changed: 3 additions & 1 deletion
@@ -29,4 +29,6 @@ dependencies:
   - pytest-xdist
   - pytest-mock
   - hypothesis>=3.58.0
-  - pytest-azurepipelines
+  # https://github.com/pandas-dev/pandas/issues/27421
+  - pytest-azurepipelines<1.0.0
+

ci/setup_env.sh

Lines changed: 6 additions & 0 deletions
@@ -94,6 +94,12 @@ echo
 echo "conda env create -q --file=${ENV_FILE}"
 time conda env create -q --file="${ENV_FILE}"

+
+if [[ "$BITS32" == "yes" ]]; then
+    # activate 32-bit compiler
+    export CONDA_BUILD=1
+fi
+
 echo "activate pandas-dev"
 source activate pandas-dev
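
The new `py36_32bit` job relies on the environment actually providing a 32-bit interpreter. A minimal sketch of how that could be checked from Python follows; it is not part of this commit, and the helper name `interpreter_bits` is hypothetical. It uses only the standard-library `struct` and `sys` modules.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size of a C pointer (void *) in bytes.
    return struct.calcsize("P") * 8


bits = interpreter_bits()
# sys.maxsize exceeds 2**32 only on 64-bit builds, so the two checks agree.
is_64bit = sys.maxsize > 2**32
print(f"{bits}-bit Python (64-bit build: {is_64bit})")
```

Running this inside the activated `pandas-dev` environment would be one way to confirm that the `BITS32` job is testing a 32-bit build.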

doc/source/development/contributing.rst

Lines changed: 4 additions & 4 deletions
@@ -288,7 +288,7 @@ complex changes to the documentation as well.
 Some other important things to know about the docs:

 * The *pandas* documentation consists of two parts: the docstrings in the code
-  itself and the docs in this folder ``pandas/doc/``.
+  itself and the docs in this folder ``doc/``.

   The docstrings provide a clear explanation of the usage of the individual
   functions, while the documentation in this folder consists of tutorial-like
@@ -404,11 +404,11 @@ Building the documentation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~

 So how do you build the docs? Navigate to your local
-``pandas/doc/`` directory in the console and run::
+``doc/`` directory in the console and run::

     python make.py html

-Then you can find the HTML output in the folder ``pandas/doc/build/html/``.
+Then you can find the HTML output in the folder ``doc/build/html/``.

 The first time you build the docs, it will take quite a while because it has to run
 all the code examples and build all the generated docstring pages. In subsequent
@@ -448,7 +448,7 @@ You can also specify to use multiple cores to speed up the documentation build::
 Open the following file in a web browser to see the full documentation you
 just built::

-    pandas/docs/build/html/index.html
+    doc/build/html/index.html

 And you'll have the satisfaction of seeing your new and improved documentation!

doc/source/reference/extensions.rst

Lines changed: 35 additions & 1 deletion
@@ -18,10 +18,44 @@ objects.
    api.extensions.register_series_accessor
    api.extensions.register_index_accessor
    api.extensions.ExtensionDtype
-   api.extensions.ExtensionArray

 .. autosummary::
    :toctree: api/
    :template: autosummary/class_without_autosummary.rst

+   api.extensions.ExtensionArray
    arrays.PandasArray
+
+.. We need this autosummary so that methods and attributes are generated.
+.. Separate block, since they aren't classes.
+
+.. autosummary::
+   :toctree: api/
+
+   api.extensions.ExtensionArray._concat_same_type
+   api.extensions.ExtensionArray._formatter
+   api.extensions.ExtensionArray._formatting_values
+   api.extensions.ExtensionArray._from_factorized
+   api.extensions.ExtensionArray._from_sequence
+   api.extensions.ExtensionArray._from_sequence_of_strings
+   api.extensions.ExtensionArray._ndarray_values
+   api.extensions.ExtensionArray._reduce
+   api.extensions.ExtensionArray._values_for_argsort
+   api.extensions.ExtensionArray._values_for_factorize
+   api.extensions.ExtensionArray.argsort
+   api.extensions.ExtensionArray.astype
+   api.extensions.ExtensionArray.copy
+   api.extensions.ExtensionArray.dropna
+   api.extensions.ExtensionArray.factorize
+   api.extensions.ExtensionArray.fillna
+   api.extensions.ExtensionArray.isna
+   api.extensions.ExtensionArray.ravel
+   api.extensions.ExtensionArray.repeat
+   api.extensions.ExtensionArray.searchsorted
+   api.extensions.ExtensionArray.shift
+   api.extensions.ExtensionArray.take
+   api.extensions.ExtensionArray.unique
+   api.extensions.ExtensionArray.dtype
+   api.extensions.ExtensionArray.nbytes
+   api.extensions.ExtensionArray.ndim
+   api.extensions.ExtensionArray.shape

doc/source/whatsnew/v0.25.0.rst

Lines changed: 48 additions & 4 deletions
@@ -400,7 +400,7 @@ of ``object`` dtype. :attr:`Series.str` will now infer the dtype data *within* t
 .. _whatsnew_0250.api_breaking.groupby_categorical:

 Categorical dtypes are preserved during groupby
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Previously, columns that were categorical, but not the groupby key(s) would be converted to ``object`` dtype during groupby operations. Pandas now will preserve these dtypes. (:issue:`18502`)

@@ -740,6 +740,47 @@ consistent with NumPy and the rest of pandas (:issue:`21801`).
740740
cat.argsort()
741741
cat[cat.argsort()]
742742
743+
.. _whatsnew_0250.api_breaking.list_of_dict:
744+
745+
Column order is preserved when passing a list of dicts to DataFrame
746+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
747+
748+
Starting with Python 3.7 the key-order of ``dict`` is `guaranteed <https://mail.python.org/pipermail/python-dev/2017-December/151283.html>`_. In practice, this has been true since
749+
Python 3.6. The :class:`DataFrame` constructor now treats a list of dicts in the same way as
750+
it does a list of ``OrderedDict``, i.e. preserving the order of the dicts.
751+
This change applies only when pandas is running on Python>=3.6 (:issue:`27309`).
752+
753+
.. ipython:: python
754+
755+
data = [
756+
{'name': 'Joe', 'state': 'NY', 'age': 18},
757+
{'name': 'Jane', 'state': 'KY', 'age': 19, 'hobby': 'Minecraft'},
758+
{'name': 'Jean', 'state': 'OK', 'age': 20, 'finances': 'good'}
759+
]
760+
761+
*Previous Behavior*:
762+
763+
The columns were lexicographically sorted previously,
764+
765+
.. code-block:: python
766+
767+
In [1]: pd.DataFrame(data)
768+
Out[1]:
769+
age finances hobby name state
770+
0 18 NaN NaN Joe NY
771+
1 19 NaN Minecraft Jane KY
772+
2 20 good NaN Jean OK
773+
774+
*New Behavior*:
775+
776+
The column order now matches the insertion-order of the keys in the ``dict``,
777+
considering all the records from top to bottom. As a consequence, the column
778+
order of the resulting DataFrame has changed compared to previous pandas verisons.
779+
780+
.. ipython:: python
781+
782+
pd.DataFrame(data)
783+
743784
.. _whatsnew_0250.api_breaking.deps:
744785
745786
Increased minimum versions for dependencies
@@ -939,6 +980,7 @@ Performance improvements
 - Improved performance by removing the need for a garbage collect when checking for ``SettingWithCopyWarning`` (:issue:`27031`)
 - For :meth:`to_datetime` changed default value of cache parameter to ``True`` (:issue:`26043`)
 - Improved performance of :class:`DatetimeIndex` and :class:`PeriodIndex` slicing given non-unique, monotonic data (:issue:`27136`).
+- Improved performance of :meth:`pd.read_json` for index-oriented data. (:issue:`26773`)

 .. _whatsnew_0250.bug_fixes:

@@ -995,6 +1037,7 @@ Timezones
 - Bug in :func:`DataFrame.join` where joining a timezone aware index with a timezone aware column would result in a column of ``NaN`` (:issue:`26335`)
 - Bug in :func:`date_range` where ambiguous or nonexistent start or end times were not handled by the ``ambiguous`` or ``nonexistent`` keywords respectively (:issue:`27088`)
 - Bug in :meth:`DatetimeIndex.union` when combining a timezone aware and timezone unaware :class:`DatetimeIndex` (:issue:`21671`)
+- Bug when applying a numpy reduction function (e.g. :meth:`numpy.minimum`) to a timezone aware :class:`Series` (:issue:`15552`)

 Numeric
 ^^^^^^^
@@ -1054,7 +1097,8 @@ Indexing
 - Bug in :class:`CategoricalIndex` and :class:`Categorical` incorrectly raising ``ValueError`` instead of ``TypeError`` when a list is passed using the ``in`` operator (``__contains__``) (:issue:`21729`)
 - Bug in setting a new value in a :class:`Series` with a :class:`Timedelta` object incorrectly casting the value to an integer (:issue:`22717`)
 - Bug in :class:`Series` setting a new key (``__setitem__``) with a timezone-aware datetime incorrectly raising ``ValueError`` (:issue:`12862`)
--
+- Bug in :meth:`DataFrame.iloc` when indexing with a read-only indexer (:issue:`17192`)
+- Bug in :class:`Series` setting an existing tuple key (``__setitem__``) with timezone-aware datetime values incorrectly raising ``TypeError`` (:issue:`20441`)

 Missing
 ^^^^^^^
@@ -1086,7 +1130,6 @@ I/O
 - Bug in :meth:`DataFrame.to_html` where header numbers would ignore display options when rounding (:issue:`17280`)
 - Bug in :func:`read_hdf` where reading a table from an HDF5 file written directly with PyTables fails with a ``ValueError`` when using a sub-selection via the ``start`` or ``stop`` arguments (:issue:`11188`)
 - Bug in :func:`read_hdf` not properly closing store after a ``KeyError`` is raised (:issue:`25766`)
-- Bug in ``read_csv`` which would not raise ``ValueError`` if a column index in ``usecols`` was out of bounds (:issue:`25623`)
 - Improved the explanation for the failure when value labels are repeated in Stata dta files and suggested work-arounds (:issue:`25772`)
 - Improved :meth:`pandas.read_stata` and :class:`pandas.io.stata.StataReader` to read incorrectly formatted 118 format files saved by Stata (:issue:`25960`)
 - Improved the ``col_space`` parameter in :meth:`DataFrame.to_html` to accept a string so CSS length values can be set correctly (:issue:`25941`)
@@ -1103,6 +1146,7 @@ I/O
 - Bug in :meth:`read_hdf` where reading a timezone aware :class:`DatetimeIndex` would raise a ``TypeError`` (:issue:`11926`)
 - Bug in :meth:`to_msgpack` and :meth:`read_msgpack` which would raise a ``ValueError`` rather than a ``FileNotFoundError`` for an invalid path (:issue:`27160`)
 - Fixed bug in :meth:`DataFrame.to_parquet` which would raise a ``ValueError`` when the dataframe had no columns (:issue:`27339`)
+- Allow parsing of :class:`PeriodDtype` columns when using :func:`read_csv` (:issue:`26934`)

 Plotting
 ^^^^^^^^
@@ -1111,7 +1155,7 @@ Plotting
 - Bug in an error message in :meth:`DataFrame.plot`. Improved the error message if non-numerics are passed to :meth:`DataFrame.plot` (:issue:`25481`)
 - Bug in incorrect ticklabel positions when plotting an index that are non-numeric / non-datetime (:issue:`7612`, :issue:`15912`, :issue:`22334`)
 - Fixed bug causing plots of :class:`PeriodIndex` timeseries to fail if the frequency is a multiple of the frequency rule code (:issue:`14763`)
--
+- Fixed bug when plotting a :class:`DatetimeIndex` with ``datetime.timezone.utc`` timezone (:issue:`17173`)
 -
 -
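
The column-ordering rule described in the `whatsnew_0250.api_breaking.list_of_dict` entry above can be illustrated without pandas. The following is a minimal sketch of the rule as the release note states it (keys in first-seen order across the records, top to bottom), not pandas' actual implementation. The helper name `column_order` is hypothetical.

```python
def column_order(records):
    """Return the keys of all records in first-seen order, top to bottom."""
    seen = {}  # dicts keep insertion order (guaranteed from Python 3.7)
    for record in records:
        for key in record:
            seen.setdefault(key, None)
    return list(seen)


data = [
    {'name': 'Joe', 'state': 'NY', 'age': 18},
    {'name': 'Jane', 'state': 'KY', 'age': 19, 'hobby': 'Minecraft'},
    {'name': 'Jean', 'state': 'OK', 'age': 20, 'finances': 'good'},
]

new_order = column_order(data)   # order described under "New Behavior"
old_order = sorted(new_order)    # lexicographic order, as under "Previous Behavior"
print(new_order)  # ['name', 'state', 'age', 'hobby', 'finances']
print(old_order)  # ['age', 'finances', 'hobby', 'name', 'state']
```

Code that depended on the old alphabetical order can restore it explicitly by sorting the column labels, as `old_order` does here.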

mypy.ini

Lines changed: 0 additions & 6 deletions
This file was deleted.

pandas/_libs/algos_take_helper.pxi.in

Lines changed: 4 additions & 4 deletions
@@ -148,7 +148,7 @@ def get_dispatch(dtypes):
 @cython.wraparound(False)
 @cython.boundscheck(False)
 cdef inline take_1d_{{name}}_{{dest}}_memview({{c_type_in}}[:] values,
-                                              int64_t[:] indexer,
+                                              const int64_t[:] indexer,
                                               {{c_type_out}}[:] out,
                                               fill_value=np.nan):

@@ -159,7 +159,7 @@ cdef inline take_1d_{{name}}_{{dest}}_memview({{c_type_in}}[:] values,
 @cython.wraparound(False)
 @cython.boundscheck(False)
 def take_1d_{{name}}_{{dest}}(ndarray[{{c_type_in}}, ndim=1] values,
-                              int64_t[:] indexer,
+                              const int64_t[:] indexer,
                               {{c_type_out}}[:] out,
                               fill_value=np.nan):

@@ -178,7 +178,7 @@ def take_1d_{{name}}_{{dest}}(ndarray[{{c_type_in}}, ndim=1] values,
 @cython.wraparound(False)
 @cython.boundscheck(False)
 cdef inline take_2d_axis0_{{name}}_{{dest}}_memview({{c_type_in}}[:, :] values,
-                                                    int64_t[:] indexer,
+                                                    const int64_t[:] indexer,
                                                     {{c_type_out}}[:, :] out,
                                                     fill_value=np.nan):
     {{inner_take_2d_axis0}}
@@ -205,7 +205,7 @@ def take_2d_axis0_{{name}}_{{dest}}(ndarray[{{c_type_in}}, ndim=2] values,
 @cython.wraparound(False)
 @cython.boundscheck(False)
 cdef inline take_2d_axis1_{{name}}_{{dest}}_memview({{c_type_in}}[:, :] values,
-                                                    int64_t[:] indexer,
+                                                    const int64_t[:] indexer,
                                                     {{c_type_out}}[:, :] out,
                                                     fill_value=np.nan):
     {{inner_take_2d_axis1}}
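
The hunks above declare `indexer` as `const int64_t[:]`. In Cython, a `const` typed memoryview accepts read-only buffers, whereas a non-const one requests a writable buffer and rejects them. This presumably relates to the `DataFrame.iloc` read-only indexer fix listed in the whatsnew diff, though the commit does not say so explicitly. The pure-Python sketch below only illustrates the underlying idea, that a take operation reads the indexer and never writes to it. The function `take_1d` here is a hypothetical stand-in, not the pandas routine, and `memoryview.toreadonly()` requires Python 3.8 or later.

```python
from array import array


def take_1d(values, indexer):
    """Gather values[i] for each i in indexer; the indexer is only read."""
    return [values[i] for i in indexer]


values = [10.0, 20.0, 30.0]
writable = memoryview(array("q", [2, 0, 1]))  # 'q' is a signed 64-bit integer
read_only = writable.toreadonly()             # same data, writes disallowed

# Reading works identically through either view.
result_rw = take_1d(values, writable)
result_ro = take_1d(values, read_only)

# Writing through the read-only view is rejected by the buffer itself.
try:
    read_only[0] = 1
except TypeError:
    write_failed = True
else:
    write_failed = False

print(result_ro, write_failed)
```

Because the gather never mutates the indexer, marking the parameter `const` costs nothing and lets callers pass immutable index buffers.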
