Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
s = pd.Series(['a','b']) # dtype: object
s.mask([True, True], 1) # dtype: object but expected to be int64
Issue Description
NDFrame._where, which is used under the hood of .where and .mask, sometimes returns an unexpected dtype.
To illustrate, the final Series in the example below is expected to have int64 dtype, but it remains object.
>>> s = pd.Series(['a','b'])
>>> s
0    a
1    b
dtype: object
>>> s.mask([True, True], 1)
0    1
1    1
dtype: object
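Since .mask is documented as the inverse of .where, the same dtype should come back from the equivalent .where call with the condition negated. The following is a minimal sketch of that check, not part of the original report; the variable names are illustrative, and the Series is built with an explicit object dtype so the starting point matches the example above regardless of pandas defaults.

```python
import numpy as np
import pandas as pd

# Start from an explicit object-dtype Series, as in the report.
s = pd.Series(["a", "b"], dtype=object)
cond = np.array([True, True])

# mask replaces values where cond is True;
# where replaces values where its condition is False.
masked = s.mask(cond, 1)
whered = s.where(~cond, 1)

# Both calls are expected to agree on values and on dtype.
print(masked.dtype, whered.dtype)
print(masked.tolist(), whered.tolist())
```

No dtype is asserted here because the result depends on the pandas version in use.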
This issue came to our attention during the construction of this PR: #50343 (comment)
Expected Behavior
The resulting object is expected to "refresh" its dtype. In the example above, the dtype should be int64.
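Until the dtype is refreshed automatically, one possible workaround (an assumption on my part, not something proposed in this report) is to call infer_objects() on the result, which re-infers a better dtype for object columns. A minimal sketch:

```python
import pandas as pd

# Same setup as the reproducible example, with an explicit object dtype.
s = pd.Series(["a", "b"], dtype=object)

# Every value is replaced by the integer 1.
result = s.mask([True, True], 1)

# infer_objects() soft-converts object data to a more specific dtype
# when all values allow it; here every value is a Python int.
refreshed = result.infer_objects()

print(result.dtype)     # version-dependent; object on pandas 1.5.3 per this report
print(refreshed.dtype)
```

This only papers over the behavior on the caller's side; it does not change what NDFrame._where returns.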
Installed Versions
INSTALLED VERSIONS
commit : 2e218d1
python : 3.10.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.19.0-38-generic
Version : #39~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 17 21:16:15 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.3
numpy : 1.23.5
pytz : 2022.7
dateutil : 2.8.2
setuptools : 65.6.3
pip : 22.3.1
Cython : 0.29.32
pytest : 7.2.0
hypothesis : 6.61.0
...
xlrd : 2.0.1
xlwt : None
zstandard : 0.19.0
tzdata : 2022.7