MultiIndex DataFrame to_csv() terminates python process #26303

Closed
miburk opened this issue May 7, 2019 · 2 comments · Fixed by #26355
Labels
Bug MultiIndex Segfault Non-Recoverable Error

miburk commented May 7, 2019

Code Sample, a copy-pastable example if possible

import traceback
import pandas as pd

# index is actually a MultiIndex
index = pd.Index([(1,), (2,), (3,)])
data = pd.DataFrame([[1, 2, 3]], columns=index)
data = data.reindex(columns=[(1,), (3,)])

try:
    # This call fails with a TypeError.
    data.to_csv('crash.csv')
except TypeError as err:
    traceback.print_exc()

# This print seems to be essential to trigger an immediate crash of the process
# in the following .to_csv call. The crash happens also if the preceding
# try/except block is removed.
print(data)
data.to_csv('crash.csv')

Problem description

The Python code above exposes two, possibly three, problems.
(1) The first call to DataFrame.to_csv raises a TypeError. This is uninformative at best. Printing the DataFrame works, so it should be possible to export it to CSV.
(2) The second call to DataFrame.to_csv terminates the Python process with exit code -1073741819 (0xC0000005).
(3) The print call seems to trigger this crash, even though I would not expect printing to alter the data structure being printed.
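
The two spellings of the exit code in (2) are the same 32-bit value, once read as a signed integer and once as unsigned. The sketch below shows the conversion; the helper name `to_unsigned32` is made up for illustration and is not part of pandas or Python. 0xC0000005 is the Windows access-violation status, roughly the counterpart of a segfault on other platforms.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which maps negative
    # values onto their two's-complement unsigned representation.
    return exit_code & 0xFFFFFFFF


# Exit code taken from the report above.
code = to_unsigned32(-1073741819)
print(hex(code))
```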

The Windows event log shows that the error happened in lib\site-packages\pandas\_libs\writers.cp36-win_amd64.pyd.
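
Because the crash kills the interpreter, it can help to run the reproduction in a child process and inspect its return code from a parent that survives. The following is only a sketch under assumptions: `run_isolated` is a hypothetical helper name, the snippet passed in is a harmless stand-in rather than the actual reproduction, and `capture_output` needs Python 3.7 or later (newer than the 3.6.6 used in this report). Enabling `faulthandler` may additionally dump a Python traceback to stderr when the child hits a fatal error.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
    )


# Stand-in snippet; swap in the code sample from this issue to test it.
result = run_isolated("import sys; sys.exit(3)")
print(result.returncode)
```

A non-zero `returncode` from the child indicates abnormal termination, while the parent process keeps running and can log `result.stderr`.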

Expected Output

Because the DataFrame is printable on the console, I would expect a successful call of to_csv().

Output of pd.show_versions()

Python Version: Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 03:37:03) [MSC v.1900 64 bit (AMD64)] on win32

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.2
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.0.1
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.0
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

WillAyd (Member) commented May 8, 2019

Strange indeed, though this segfaulted for me during the try...except block.

Investigation and PRs are always welcome!

@WillAyd WillAyd added Bug MultiIndex Segfault Non-Recoverable Error labels May 8, 2019
@WillAyd WillAyd added this to the Contributions Welcome milestone May 8, 2019
@SummerGram

I will do this. Assign it to me please.

4 participants