Reading zipped utf-16 file: AttributeError: 'UTF8Recoder' object has no attribute 'seek' #18071

Closed
nickeubank opened this issue Nov 2, 2017 · 4 comments · Fixed by #18091
Labels: Bug, IO CSV (read_csv, to_csv)
Comments

@nickeubank
Contributor

Code Sample, a copy-pastable example if possible

import pandas as pd

# Read a tab-delimited, UTF-16 encoded text file stored inside a ZIP archive.
df = pd.read_table('myfile.zip',
                   compression='zip',
                   encoding='utf-16')

Problem description

Traceback (most recent call last):
  File "<ipython-input-4-6daea174e70a>", line 3, in <module>
    encoding='utf-16')
  File "/Users/Nick/anaconda/lib/python3.5/site-packages/pandas/io/parsers.py", line 705, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/Nick/anaconda/lib/python3.5/site-packages/pandas/io/parsers.py", line 445, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/Nick/anaconda/lib/python3.5/site-packages/pandas/io/parsers.py", line 814, in __init__
    self._make_engine(self.engine)
  File "/Users/Nick/anaconda/lib/python3.5/site-packages/pandas/io/parsers.py", line 1045, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Users/Nick/anaconda/lib/python3.5/site-packages/pandas/io/parsers.py", line 1684, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 391, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 664, in pandas._libs.parsers.TextReader._setup_parser_source
  File "/Users/Nick/anaconda/lib/python3.5/zipfile.py", line 1026, in __init__
    self._RealGetContents()
  File "/Users/Nick/anaconda/lib/python3.5/zipfile.py", line 1090, in _RealGetContents
    endrec = _EndRecData(fp)
  File "/Users/Nick/anaconda/lib/python3.5/zipfile.py", line 241, in _EndRecData
    fpin.seek(0, 2)
AttributeError: 'UTF8Recoder' object has no attribute 'seek'
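
The last frames of the traceback suggest that `zipfile` was handed a recoding wrapper rather than the raw file, and that the wrapper does not support `seek`. The following is a minimal standard-library sketch of that failure mode, not pandas' actual code: `ReadOnlyRecoder` is a hypothetical stand-in for `UTF8Recoder`, assumed here to expose only `read()`.

```python
import io
import zipfile


class ReadOnlyRecoder:
    """Hypothetical stand-in for a recoding wrapper: has read() but no seek()."""

    def __init__(self, raw):
        self._raw = raw

    def read(self, size=-1):
        return self._raw.read(size)


# Build a small ZIP archive in memory holding one UTF-16 encoded member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data.txt", "a\tb\n1\t2\n".encode("utf-16"))
buffer.seek(0)

# zipfile seeks to the end of the file to locate the archive directory,
# so a wrapper without seek() fails before any data is read.
try:
    zipfile.ZipFile(ReadOnlyRecoder(buffer))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this reading of the traceback is right, the order of operations matters: the archive has to be opened on the seekable byte stream first, and decoding applied to the extracted member afterwards.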

Expected Output

A table!

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 3.0.7
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: 1.5.4
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 1.5.1
openpyxl: 2.4.5
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999
sqlalchemy: 1.1.5
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.0.9
fastparquet: None
pandas_gbq: None
pandas_datareader: None

gfyoung added the IO CSV (read_csv, to_csv) label on Nov 2, 2017
@gfyoung
Member

gfyoung commented Nov 2, 2017

Hmmm...that looks kind of odd. Mind sharing the ZIP file for us to reproduce?

@nickeubank
Contributor Author

@gfyoung sorry for the delay. I couldn't upload the original file, but here's what seems to be a minimal working example:

sample.zip
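
The attachment itself is not reproduced here. A comparable archive can be generated locally with the standard library; in this sketch the member name `sample.txt` and the two-column contents are assumptions, not the contents of the original `sample.zip`.

```python
import os
import tempfile
import zipfile


def make_sample_zip(directory):
    """Write a ZIP holding one tab-separated UTF-16 text file (made-up contents)."""
    path = os.path.join(directory, "sample.zip")
    text = "col_a\tcol_b\n1\tx\n2\ty\n"
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # str.encode("utf-16") prepends a byte order mark to the payload.
        archive.writestr("sample.txt", text.encode("utf-16"))
    return path


workdir = tempfile.mkdtemp()
sample_path = make_sample_zip(workdir)

# Read the member back to confirm the archive round-trips.
with zipfile.ZipFile(sample_path) as archive:
    member_names = archive.namelist()
    raw = archive.read("sample.txt")

print(member_names, len(raw))
```

Passing `sample_path` to the `pd.read_table(..., compression='zip', encoding='utf-16')` call from the report should then exercise the same code path on an affected pandas version.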

@gfyoung
Member

gfyoung commented Nov 2, 2017

Yep, I can reproduce this error. Investigation and a PR with a patch are welcome!
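
Until a fix is available, one possible workaround is to decompress first and decode second, outside of pandas. The sketch below is an assumption about what would work, not the change made in the linked fix: `read_zipped_utf16_rows` is a hypothetical helper, and it parses with the standard `csv` module so that it runs without pandas installed.

```python
import csv
import io
import zipfile


def read_zipped_utf16_rows(zip_source, member=None, delimiter="\t"):
    """Open the archive on raw bytes, then decode the chosen member as UTF-16."""
    with zipfile.ZipFile(zip_source) as archive:
        # Default to the first member when no name is given.
        name = member if member is not None else archive.namelist()[0]
        with archive.open(name) as raw:
            # Decoding wraps the extracted member, so zipfile itself only
            # ever sees the seekable byte stream.
            text = io.TextIOWrapper(raw, encoding="utf-16", newline="")
            return list(csv.reader(text, delimiter=delimiter))


# Demo on an in-memory archive with made-up contents.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.txt", "col_a\tcol_b\n1\tx\n".encode("utf-16"))
buffer.seek(0)

rows = read_zipped_utf16_rows(buffer)
print(rows)
```

Since `pd.read_csv` accepts file-like objects, the decoded `text` handle could presumably be passed to it with `sep="\t"` in place of `csv.reader` to get a DataFrame directly.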

@nickeubank
Contributor Author

Thanks @Licht-T ! Much appreciated!


3 participants