The DataFrame in the code snippet has a column called 'name', and calling df.agg({'name': 'count'}) on it raises an exception. Following the stack trace, line 475 of core/base.py passes df.name, which is the 'name' column itself, as the name argument of
result = Series(result, name=getattr(self, "name", None))
whenever the DataFrame has a column called 'name'.
The same snippet works for any other column name. For example, renaming the column to nameee makes it run without error.
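The collision can be seen without calling agg at all. The following is a minimal sketch, assuming only that pandas is installed: attribute access on a DataFrame falls back to column lookup, so getattr(df, "name", None) returns the column rather than the default, and a Series is not hashable. The variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import is_hashable

df = pd.DataFrame({"name": ["abc", "xyz"]})
other = pd.DataFrame({"nameee": ["abc", "xyz"]})

# Attribute access on a DataFrame falls back to column lookup,
# so "name" resolves to the column instead of the default None.
colliding = getattr(df, "name", None)
print(type(colliding).__name__)
print(is_hashable(colliding))

# With no column called "name", the default value is returned.
print(getattr(other, "name", None))
```

This is why only a column labelled 'name' triggers the error: any other label leaves the getattr default in place.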
Stack trace
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
470 try:
--> 471 result = DataFrame(result)
472 except ValueError:
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
467 elif isinstance(data, dict):
--> 468 mgr = init_dict(data, index, columns, dtype=dtype)
469 elif isinstance(data, ma.MaskedArray):
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
282 ]
--> 283 return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
284
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype, verify_integrity)
77 if index is None:
---> 78 index = extract_index(arrays)
79 else:
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/internals/construction.py in extract_index(data)
386 if not indexes and not raw_lengths:
--> 387 raise ValueError("If using all scalar values, you must pass an index")
388
ValueError: If using all scalar values, you must pass an index
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/frame.py in aggregate(self, func, axis, *args, **kwargs)
7358 try:
-> 7359 result, how = self._aggregate(func, axis=axis, *args, **kwargs)
7360 except TypeError as err:
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/frame.py in _aggregate(self, arg, axis, *args, **kwargs)
7383 return result, how
-> 7384 return super()._aggregate(arg, *args, **kwargs)
7385
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
474 # we have a dict of scalars
--> 475 result = Series(result, name=getattr(self, "name", None))
476
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
228
--> 229 name = ibase.maybe_extract_name(name, data, type(self))
230
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/indexes/base.py in maybe_extract_name(name, obj, cls)
5658 if not is_hashable(name):
-> 5659 raise TypeError(f"{cls.__name__}.name must be a hashable type")
5660
TypeError: Series.name must be a hashable type
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-6-efe06ed8dce0> in <module>
2 data = {"name": ["abc", "xyz"]}
3 df = pd.DataFrame(data)
----> 4 print(df.agg({'name': 'count'}))
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/frame.py in aggregate(self, func, axis, *args, **kwargs)
7363 f"incompatible data and dtype: {err}"
7364 )
-> 7365 raise exc from err
7366 if result is None:
7367 return self.apply(func, axis=axis, args=args, **kwargs)
TypeError: DataFrame constructor called with incompatible data and dtype: Series.name must be a hashable type
~/.virtualenvs/dimensions-connectors/lib/python3.7/site-packages/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
474 # we have a dict of scalars
--> 475 result = Series(result, name=getattr(self, "name", None))
In this case, self.name references a Series (the 'name' column) rather than a hashable value.
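One way to avoid this kind of collision is to read name only from objects that really are a Series. The helper below, result_name, is hypothetical and written purely for illustration; it is not the patch applied in pandas.

```python
import pandas as pd


def result_name(obj):
    """Return obj.name for a Series, else None (hypothetical helper)."""
    # A DataFrame has no name of its own; getattr would pick up a column
    # labelled "name", so only trust the attribute on a Series.
    return obj.name if isinstance(obj, pd.Series) else None


df = pd.DataFrame({"name": ["abc", "xyz"]})

# Build the dict-of-scalars result by hand and name it safely.
counts = pd.Series({"name": df["name"].count()}, name=result_name(df))
print(counts)
```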
phofl added the Apply label and removed the Needs Triage label on Sep 8, 2020.
Yeah, that's right. Hope this gets fixed soon. For now, I'm hacking around this for my development. Thanks a lot.
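The comment does not say which workaround was used. One possible sketch, assuming the goal is the per-column count shown in the expected output, is to call count() on a column selection and so avoid the dict-based agg call:

```python
import pandas as pd

df = pd.DataFrame({"name": ["abc", "xyz"]})

# Select the column as a one-column DataFrame and count non-null values;
# the result is a Series indexed by column label.
counts = df[["name"]].count()
print(counts)
```

Temporarily renaming the column before calling agg would be another option, since the issue reports that other column names work.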
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
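The snippet itself is missing from this copy of the issue. The version below is reconstructed from the ipython-input frame in the stack trace, which shows lines 2 to 4 of the cell; the import on line 1 is an assumption. The try/except is also an addition, so that the sketch runs to completion on both affected and unaffected pandas versions.

```python
import pandas as pd  # assumed: line 1 of the cell is not shown in the traceback

data = {"name": ["abc", "xyz"]}
df = pd.DataFrame(data)

try:
    result = df.agg({"name": "count"})
    print(result)
except TypeError as err:
    # Reported on pandas 1.1.1: "Series.name must be a hashable type"
    result = err
    print(f"TypeError: {err}")
```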
Expected Output
name 2
dtype: int64
Output of pd.show_versions()
INSTALLED VERSIONS
commit : f2ca0a2
python : 3.7.5.final.0
python-bits : 64
OS : Darwin
OS-release : 17.4.0
Version : Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.1
numpy : 1.18.4
pytz : 2019.3
dateutil : 2.8.1
pip : 20.1
setuptools : 46.1.3
Cython : 0.29.16
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 0.7.4
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.11
tables : None
tabulate : 0.7.7
xarray : None
xlrd : 1.2.0
xlwt : None
numba : None