Different behavior of to_replace method between pandas version 0.23.4 and 0.24.2 by changing dtype of series #25797

@oliverfu89

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

# Case 1: replace True with NaN; fails in version 0.24.2 but works in 0.23.4
df = pd.DataFrame.from_dict({'Test': ['0.5', True, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].replace([True], [np.nan])
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'

# Case 2: replace None with NaN; fails in version 0.24.2 but works in 0.23.4
df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].replace([None], [np.nan])
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'

# Case 3: fillna instead of replace; works in both mentioned versions
df = pd.DataFrame.from_dict({'Test': ['0.5', None, '0.6']})
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'
df['Test'] = df['Test'].fillna(np.nan)
print(df['Test'].dtype)
assert df['Test'].dtype == 'object'

Problem description

I encountered this change in behavior when moving from pandas 0.23.4 to 0.24.2: after calling replace (with a list passed as to_replace), the dtype of an object column is no longer object. I could not find anything in the release notes or the latest documentation that informs users about this change. I am not sure whether the new behavior is an improvement, but I think it should be documented either way. Since it does not appear to be mentioned anywhere, it may also hint at a deeper problem in the replace method.
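To make the comparison between versions easier to reproduce, here is a minimal sketch that prints the dtype before and after each operation together with the installed pandas version. The helper name `dtype_before_after` is hypothetical and not part of pandas; the Series is built with an explicit `dtype=object` so that the starting dtype does not depend on version-specific inference.

```python
import numpy as np
import pandas as pd


def dtype_before_after(series, operation):
    """Hypothetical helper: return (dtype before, dtype after) for `operation`."""
    result = operation(series)
    return series.dtype, result.dtype


# Explicit object dtype so the starting point is the same on every version.
s = pd.Series(['0.5', None, '0.6'], dtype=object)

checks = {
    "replace([None], [np.nan])": lambda x: x.replace([None], [np.nan]),
    "fillna(np.nan)": lambda x: x.fillna(np.nan),
}

print("pandas", pd.__version__)
for label, op in checks.items():
    before, after = dtype_before_after(s, op)
    print(f"{label}: {before} -> {after}")
```

Running this under both 0.23.4 and 0.24.2 should show whether the dtype after replace differs between the two, without relying on assertions that abort at the first failure.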

Expected Output

No assertion errors: the dtype of the column stays object in all three cases.
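Until the behavior is clarified, one possible way to get the expected result is to avoid replace altogether and blank out the matching entries with a boolean mask. This is only a sketch under that assumption, not a statement about how replace should be fixed; `replace_keep_object` is a hypothetical helper name, and the final `astype(object)` is there to guard against any dtype inference in the masking step.

```python
import numpy as np
import pandas as pd


def replace_keep_object(series, is_target):
    """Hypothetical helper: set entries matching `is_target` to NaN, keep object dtype."""
    # Evaluate the predicate on every element, including None.
    mask = series.map(is_target).astype(bool)
    # Overwrite the matching entries and pin the dtype to object.
    return series.mask(mask, np.nan).astype(object)


# Case 1 from the report: replace True with NaN.
s_true = pd.Series(['0.5', True, '0.6'], dtype=object)
out_true = replace_keep_object(s_true, lambda v: v is True)
print(out_true.dtype)

# Case 2 from the report: replace None with NaN.
s_none = pd.Series(['0.5', None, '0.6'], dtype=object)
out_none = replace_keep_object(s_none, lambda v: v is None)
print(out_none.dtype)
```

The predicate uses identity checks (`v is True`, `v is None`) so that values such as `1` or `'True'` are not matched by accident.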

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.15.4
scipy: 1.1.0
pyarrow: 0.12.1
xarray: None
IPython: 7.3.0
sphinx: 1.8.5
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.5
lxml.etree: 4.3.2
bs4: None
html5lib: None
sqlalchemy: 1.3.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels: Dtype Conversions, Missing-data, Needs Tests, good first issue, replace