.diff(axis=1) gives NaNs with different types. #21437

Closed
eoincondron opened this issue Jun 12, 2018 · 3 comments
Labels
Dtype Conversions: Unexpected or buggy dtype conversions
Multi-Block: Issues caused by the presence of multiple Blocks
Numeric Operations: Arithmetic, Comparison, and Logical operations

Comments

@eoincondron

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3, dtype='float32'), 'b': np.arange(3, dtype='float64')})
df.diff(axis=1)

	a	b
0	NaN	NaN
1	NaN	NaN
2	NaN	NaN

df = pd.DataFrame({'a': np.arange(3, dtype='int32'), 'b': np.arange(3, dtype='int64')})
df.diff(axis=1)

	a	b
0	NaN	NaN
1	NaN	NaN
2	NaN	NaN

Problem description

When diffing across the column axis with columns of different numeric dtypes, every value in the result is NaN.
I'm not sure if this is intentional behaviour, but given that df.a - df.b works as expected, I would expect .diff to do the same. I think it should at least emit a warning when the dtypes differ.
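
To make the comparison concrete, here is a minimal sketch that contrasts explicit subtraction with a diff taken after casting every column to one common dtype. The cast is an assumption about what sidesteps the problem on affected versions (a single dtype should mean a single block); it is not a confirmed fix, and the names `explicit`, `common` and `diffed` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": np.arange(3, dtype="float32"),
    "b": np.arange(3, dtype="float64"),
})

# Explicit subtraction upcasts float32 to float64 and yields zeros.
explicit = df["b"] - df["a"]

# Assumed workaround: cast all columns to their common dtype first,
# so the frame no longer mixes dtypes when diff runs across columns.
common = np.result_type(*df.dtypes)
diffed = df.astype(common).diff(axis=1)

print(explicit)
print(diffed)
```

The same pattern should apply to the int32/int64 frame, where the common dtype is int64.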

Expected Output

	a	b
0	NaN	0
1	NaN	0
2	NaN	0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@WillAyd
Member

WillAyd commented Jun 12, 2018

Fair callout. I suppose the below works so should probably work for axis=1 as well:

In [9]: df.T.diff()
Out[9]: 
     0    1    2
a  NaN  NaN  NaN
b  0.0  0.0  0.0

Investigation and PRs welcome!
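
For anyone stuck on an affected version, the transpose above can be turned into a stopgap by transposing back afterwards. This is only a sketch built on the df.T.diff() observation; `diff_across_columns` is a hypothetical helper name, not a pandas function, and the double transpose copies the data, so it may be slow on large frames.

```python
import numpy as np
import pandas as pd


def diff_across_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Diff along the column axis by diffing the transposed frame row-wise."""
    # Transposing a mixed-dtype numeric frame upcasts to one common dtype,
    # so the row-wise diff sees a single homogeneous block.
    return frame.T.diff().T


df = pd.DataFrame({
    "a": np.arange(3, dtype="int32"),
    "b": np.arange(3, dtype="int64"),
})
result = diff_across_columns(df)
print(result)
```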

@WillAyd added the Dtype Conversions label Jun 12, 2018

@aschade92

Taking a look!

@jbrockmendel added the Numeric Operations and Multi-Block labels Sep 21, 2020

@jbrockmendel
Member

closed by #32995


4 participants