Description
Code Sample, a copy-pastable example
import numpy as np
import pandas as pd
import datetime as dt
data = np.array([0.1552, 0.1746, 0.1932, 0.234 , np.nan, 0.2423, 0.1648,
                 np.nan, 0.2148, 0.2081, 0.2313, 0.2011, np.nan, 0.2076,
                 0.2096, np.nan, 0.1801, 0.1872, 0.1878, 0.1949, np.nan,
                 0.1608, np.nan, np.nan, 0.1793, np.nan, 0.1689, 0.1631,
                 np.nan, 0.1586, 0.1531, np.nan, 0.149 , 0.1434, np.nan,
                 0.1526, np.nan, 0.1293, 0.1268, np.nan])
dates_pandas = pd.date_range(start=dt.date(2020, 3, 24), periods=data.shape[0])
ser = pd.Series(data, index=dates_pandas)
# Rolling mean over the full series, starting 2020-03-24
full_mean = ser.rolling(window="20D",
                        min_periods=2,
                        center=False,
                        closed='both').mean()
# Same rolling mean, but with the first day (2020-03-24) dropped
shorter_mean = ser['2020-03-25':].rolling(window="20D",
                                          min_periods=2,
                                          center=False,
                                          closed='both').mean()
print(full_mean['2020-05-02'], full_mean['2020-05-02'].round(4))
print(shorter_mean['2020-05-02'], shorter_mean['2020-05-02'].round(4))
Output
0.15664999999999998 0.1566
0.15665000000000004 0.1567
Problem description
The rolling mean in the example should only take the values inside the trailing 20-day window into account. The output for the last day, 2020-05-02, does however depend on whether the value on 2020-03-24 is included in the series, although that date lies well outside the window.
This is especially visible if we round the output to the four-decimal precision of the input data.
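The report does not say why a value outside the window can influence the result. One plausible mechanism, stated here as an assumption and not as a confirmed description of pandas internals, is a rolling sum that is updated incrementally (add the incoming value, subtract the outgoing one) instead of being recomputed per window. Rounding error introduced by a value then stays in the running total after that value has left the window. The sketch below uses hypothetical helper names and deliberately exaggerated magnitudes to make the effect visible; in the report the difference is only in the last bits.

```python
import math


def online_rolling_sum(values, window):
    """Rolling sum via add-new / subtract-old updates (illustrative helper)."""
    total = 0.0
    out = []
    for i, v in enumerate(values):
        total += v                      # add the value entering the window
        if i >= window:
            total -= values[i - window]  # subtract the value leaving it
        out.append(total)
    return out


def direct_rolling_sum(values, window):
    """Rolling sum recomputed from scratch for every window."""
    return [math.fsum(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))]


values = [1e16, 1.0, 1.0, 1.0]

with_history = online_rolling_sum(values, window=2)[-1]
without_history = online_rolling_sum(values[1:], window=2)[-1]
reference = direct_rolling_sum(values, window=2)[-1]

# The last window is [1.0, 1.0] in all three cases.
print(with_history)     # 0.0  (error from 1e16 is still in the total)
print(without_history)  # 2.0
print(reference)        # 2.0
```

Here `1e16 + 1.0` rounds back to `1e16`, so the two `1.0` values are lost while the large value is in the total, and subtracting `1e16` later cannot restore them.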
Expected Output
Both rolling means should be the same value.
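As a cross-check, the window ending on 2020-05-02 with `closed='both'` spans 2020-04-12 through 2020-05-02 and contains 12 non-NaN values that sum to 1.8798, so the exact mean is 0.15665. That is a tie for rounding to four decimals, which is why a last-bit difference flips the rounded result between 0.1566 and 0.1567. The following is a minimal sketch of how one might compute a reference value directly from the slice and compare the two rolling results with a tolerance. The `atol=1e-12` threshold is an arbitrary choice for illustration, and the comparison only masks the discrepancy, it does not remove it.

```python
import numpy as np
import pandas as pd

# Same data as in the code sample above
data = np.array([0.1552, 0.1746, 0.1932, 0.234, np.nan, 0.2423, 0.1648,
                 np.nan, 0.2148, 0.2081, 0.2313, 0.2011, np.nan, 0.2076,
                 0.2096, np.nan, 0.1801, 0.1872, 0.1878, 0.1949, np.nan,
                 0.1608, np.nan, np.nan, 0.1793, np.nan, 0.1689, 0.1631,
                 np.nan, 0.1586, 0.1531, np.nan, 0.149, 0.1434, np.nan,
                 0.1526, np.nan, 0.1293, 0.1268, np.nan])
ser = pd.Series(data, index=pd.date_range("2020-03-24", periods=len(data)))

end = pd.Timestamp("2020-05-02")
start = end - pd.Timedelta("20D")

# Reference: mean of the window computed directly from the slice.
# Label slicing includes both endpoints, matching closed='both';
# Series.mean() skips NaN by default.
reference = ser.loc[start:end].mean()

kwargs = dict(window="20D", min_periods=2, closed="both")
full_mean = ser.rolling(**kwargs).mean()
shorter_mean = ser.loc["2020-03-25":].rolling(**kwargs).mean()

# Compare with an absolute tolerance instead of comparing rounded values
agree = np.isclose(full_mean.loc[end], shorter_mean.loc[end],
                   rtol=0.0, atol=1e-12)
print(reference, full_mean.loc[end], shorter_mean.loc[end], agree)
```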
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.6.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.6.12-arch1-1
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 41.6.0.post20191101
Cython : 0.29.13
pytest : 5.3.0
hypothesis : None
sphinx : 2.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : None
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.0
pyxlsb : None
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.17
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None
None