Description
In the newly released pandas==1.1.0, apply(func, axis=1) does not work as expected.
Pandas seems to overwrite every row in the data frame with the first row. This happens specifically when func adds a new column to each row it processes.
This worked in pandas==1.0.5, so it appears to be a bug in pandas==1.1.0.
I am attaching a sample script and the logs captured for pandas==1.0.5 and pandas==1.1.0.
Attachments:
- sample script to reproduce the issue (rename to .py before running) --> script.txt
- output 1 (pandas==1.0.5) - working as expected --> out_pandas_1.0.5.log
- output 2 (pandas==1.1.0) - buggy --> out_pandas_1.1.0.log
As the out_pandas_1.1.0.log log shows, after preprocessing the data frame with df = df.apply(process_text, axis=1), all the rows in the data frame have been overwritten with the first row.
This was not the case with pandas==1.0.5; see the out_pandas_1.0.5.log log.
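For readers without the attachments, the pattern described above can be sketched roughly as follows. This is not the attached script.txt: the body of process_text, the column names text and clean_text, and the sample data are all assumptions made for illustration. Only the call df.apply(process_text, axis=1) comes from the report.

```python
import pandas as pd


def process_text(row):
    # Hypothetical stand-in for the function in script.txt:
    # it adds a new column to the row and returns the row.
    row["clean_text"] = row["text"].lower()
    return row


# Illustrative data; the real script uses its own data frame.
df = pd.DataFrame({"text": ["First Row", "Second Row", "Third Row"]})

# The call from the report: run process_text on each row.
result = df.apply(process_text, axis=1)

# True if every row ended up with the same "text" value,
# which is the symptom described in this report.
rows_collapsed = result["text"].nunique() == 1

print("pandas", pd.__version__, "- rows collapsed:", rows_collapsed)
print(result)
```

If this sketch exercises the same code path as the attached script, the check should report collapsed rows on pandas==1.1.0 and distinct rows on pandas==1.0.5, but that has not been confirmed against the attached logs.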
Environment
- OS: Ubuntu 20.04
- Python: 3.7.7 (anaconda env)