Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
from typing import Tuple
# sample data
data = {'Date': [pd.Timestamp('2019-01-31 00:00:00'), pd.Timestamp('2019-02-01 00:00:00'), pd.Timestamp('2019-02-04 00:00:00'), pd.Timestamp('2019-02-05 00:00:00'), pd.Timestamp('2019-02-06 00:00:00'), pd.Timestamp('2019-02-07 00:00:00'), pd.Timestamp('2019-02-08 00:00:00'), pd.Timestamp('2019-02-11 00:00:00'), pd.Timestamp('2019-02-12 00:00:00'), pd.Timestamp('2019-02-13 00:00:00'), pd.Timestamp('2019-02-14 00:00:00')],
'Close': [166.44000244140625, 166.52000427246094, 171.25, 174.17999267578125, 174.24000549316406, 170.94000244140625, 170.41000366210938, 169.42999267578125, 170.88999938964844, 170.17999267578125, 170.8000030517578]}
# create dataframe
aapl = pd.DataFrame(data)
def find_trend(data: pd.DataFrame, period: int) -> Tuple[pd.Series, pd.Series]:
    data['sma'] = data['Close'].rolling(period).mean()  # this creates an inplace update to aapl
    diff = data['sma'] - data['sma'].shift(1)  # calculates a series of values
    greater_than_0 = diff > 0  # creates a series of bools
    return diff, greater_than_0
aapl['value'], aapl['trend'] = find_trend(aapl, 4)
Current Output
- Note the creation of the `sma` column. Is this the expected behavior?
| | Date | Close | sma | value | trend |
|---:|:--------------------|--------:|--------:|------------:|:--------|
| 0 | 2019-01-31 00:00:00 | 166.44 | nan | nan | False |
| 1 | 2019-02-01 00:00:00 | 166.52 | nan | nan | False |
| 2 | 2019-02-04 00:00:00 | 171.25 | nan | nan | False |
| 3 | 2019-02-05 00:00:00 | 174.18 | 169.597 | nan | False |
| 4 | 2019-02-06 00:00:00 | 174.24 | 171.548 | 1.95 | True |
| 5 | 2019-02-07 00:00:00 | 170.94 | 172.653 | 1.105 | True |
| 6 | 2019-02-08 00:00:00 | 170.41 | 172.443 | -0.209999 | False |
| 7 | 2019-02-11 00:00:00 | 169.43 | 171.255 | -1.1875 | False |
| 8 | 2019-02-12 00:00:00 | 170.89 | 170.417 | -0.837502 | False |
| 9 | 2019-02-13 00:00:00 | 170.18 | 170.227 | -0.190002 | False |
| 10 | 2019-02-14 00:00:00 | 170.8 | 170.325 | 0.0974998 | True |
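One way to check where the extra column comes from is to compare object identities. The following is a minimal sketch, not a statement about pandas internals: `add_column` and `frame` are hypothetical names used only for illustration, and the assumption being tested is that the function parameter refers to the caller's DataFrame rather than to a copy.

```python
import pandas as pd


def add_column(data: pd.DataFrame) -> int:
    # Column assignment modifies the object that `data` refers to.
    data['sma'] = data['Close'].rolling(2).mean()
    # Return the identity of the object seen inside the function.
    return id(data)


# `frame` is a hypothetical stand-in for `aapl`.
frame = pd.DataFrame({'Close': [1.0, 2.0, 3.0]})

# Compare the identity seen inside the function with the caller's object.
same_object = add_column(frame) == id(frame)
columns_after = list(frame.columns)

print(same_object)    # True
print(columns_after)  # ['Close', 'sma']
```

If the identities match, the `sma` column appearing on the caller's frame is consistent with ordinary Python argument passing, where no copy is made on a function call.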
Problem description
- I do not expect `data['sma'] = data['Close'].rolling(period).mean()` to create an in-place update to `aapl`.
- This seems related to:
Expected Output
| | Date | Close | value | trend |
|---:|:--------------------|--------:|------------:|:--------|
| 0 | 2019-01-31 00:00:00 | 166.44 | nan | False |
| 1 | 2019-02-01 00:00:00 | 166.52 | nan | False |
| 2 | 2019-02-04 00:00:00 | 171.25 | nan | False |
| 3 | 2019-02-05 00:00:00 | 174.18 | nan | False |
| 4 | 2019-02-06 00:00:00 | 174.24 | 1.95 | True |
| 5 | 2019-02-07 00:00:00 | 170.94 | 1.105 | True |
| 6 | 2019-02-08 00:00:00 | 170.41 | -0.209999 | False |
| 7 | 2019-02-11 00:00:00 | 169.43 | -1.1875 | False |
| 8 | 2019-02-12 00:00:00 | 170.89 | -0.837502 | False |
| 9 | 2019-02-13 00:00:00 | 170.18 | -0.190002 | False |
| 10 | 2019-02-14 00:00:00 | 170.8 | 0.0974998 | True |
Resolves Issue
- I know that changing the function as follows avoids the in-place update to `aapl`:
def find_trend(data: pd.DataFrame, period: int) -> Tuple[pd.Series, pd.Series]:
    sma = data['Close'].rolling(period).mean()  # does not create an inplace update to aapl
    diff = sma - sma.shift(1)  # calculates a series of values
    greater_than_0 = diff > 0  # creates a series of bools
    return diff, greater_than_0
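Another option that keeps the `data['sma']` assignment is to work on an explicit copy inside the function. This is a sketch under stated assumptions: `find_trend_copy` and the small `prices` frame are hypothetical names and data introduced only for illustration, and `DataFrame.copy()` is used so that the assignment lands on a separate object.

```python
import pandas as pd
from typing import Tuple


def find_trend_copy(data: pd.DataFrame, period: int) -> Tuple[pd.Series, pd.Series]:
    # Work on a private copy so the caller's frame is left untouched.
    data = data.copy()
    data['sma'] = data['Close'].rolling(period).mean()
    diff = data['sma'] - data['sma'].shift(1)  # change in the moving average
    greater_than_0 = diff > 0  # NaN comparisons evaluate to False
    return diff, greater_than_0


# Hypothetical sample data, much smaller than the frame in the report.
prices = pd.DataFrame({'Close': [1.0, 2.0, 4.0, 3.0, 5.0]})
prices['value'], prices['trend'] = find_trend_copy(prices, 2)

print(list(prices.columns))  # ['Close', 'value', 'trend']
```

The copy costs extra memory for large frames, so the local-variable version above is likely the lighter choice when the intermediate column is not needed afterwards.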
Output of pd.show_versions()
INSTALLED VERSIONS
commit : d9fff27
python : 3.8.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.2
setuptools : 49.6.0.post20200814
Cython : 0.29.21
pytest : 6.0.1
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : 1.2.9
lxml.etree : 4.5.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: 0.9.0
bs4 : 4.9.1
bottleneck : 1.3.2
fsspec : 0.8.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.1
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.4
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : 1.3.18
tables : 3.6.1
tabulate : 0.8.7
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.50.1