Accept multiple lambda in groupby list #26430
There’s an open PR fixing that.
On Aug 21, 2019, at 05:17, Florian Wetschoreck wrote:
Thank you for adding this @TomAugspurger
Everything works great for me except for example [4]: df.groupby("A").agg(b=('B', lambda x: 0), c=('B', lambda x: 1))
Which results in the following error:
KeyError Traceback (most recent call last)
<ipython-input-149-3ff87fe40b34> in <module>
1 df = pd.DataFrame({"A": ['a', 'a'], 'B': [1, 2], 'C': [3, 4]})
2 #df.groupby("A").agg({'B': [lambda x: 0, lambda x: 1]})
----> 3 df.groupby("A").agg(b=('B', lambda x: 0), c=('B', lambda x: 1))
4 #df.groupby("A").agg(b=('B', lambda x: 0), c=('C', lambda x: 1))
/usr/local/lib/python3.7/site-packages/pandas/core/groupby/generic.py in aggregate(self, arg, *args, **kwargs)
1453 @appender(_shared_docs["aggregate"])
1454 def aggregate(self, arg=None, *args, **kwargs):
-> 1455 return super().aggregate(arg, *args, **kwargs)
1456
1457 agg = aggregate
/usr/local/lib/python3.7/site-packages/pandas/core/groupby/generic.py in aggregate(self, func, *args, **kwargs)
262
263 if relabeling:
--> 264 result = result[order]
265 result.columns = columns
266
/usr/local/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
2979 if is_iterator(key):
2980 key = list(key)
-> 2981 indexer = self.loc._convert_to_indexer(key, axis=1, raise_missing=True)
2982
2983 # take() does not accept boolean indexers
/usr/local/lib/python3.7/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter, raise_missing)
1269 # When setting, missing keys are not allowed, even with .loc:
1270 kwargs = {"raise_missing": True if is_setter else raise_missing}
-> 1271 return self._get_listlike_indexer(obj, axis, **kwargs)[1]
1272 else:
1273 try:
/usr/local/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
1076
1077 self._validate_read_indexer(
-> 1078 keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
1079 )
1080 return keyarr, indexer
/usr/local/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
1161 raise KeyError(
1162 "None of [{key}] are in the [{axis}]".format(
-> 1163 key=key, axis=self.obj._get_axis_name(axis)
1164 )
1165 )
KeyError: "None of [MultiIndex([('B', '<lambda>'),\n ('B', '<lambda>')],\n )] are in the [columns]"
The new named approach only works for different columns for me. E.g. when I change column B to C in the second aggregation:
df.groupby("A").agg(b=('B', lambda x: 0), c=('C', lambda x: 1))
We currently don't allow duplicate function names in the list passed to .groupby().agg({'col': [aggfuncs]}). This is painful with multiple lambdas, which all have the name <lambda>.
I propose that we mangle the names somehow.
The mangling I have in mind adds a 1, 2, ... to all subsequent lambdas in the same MI (MultiIndex) level. It doesn't change the first. Do we want <lambda 0> for the first?
As a side-effect, this enables multiple lambdas per column with the new keyword aggregation.
I have a WIP started. Will do for 0.25.
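To make the proposal concrete, here is a rough sketch of the mangling in plain Python. It is an illustration only, not the pandas implementation: the helper name `mangled_names` and the exact `<lambda N>` suffix format are assumptions made for this example, and the first lambda is left unchanged as described above.

```python
def mangled_names(funcs):
    """Return display names for funcs, numbering every lambda after the first.

    Hypothetical helper for illustration; the suffix format is an assumption.
    """
    names = []
    lambdas_seen = 0
    for func in funcs:
        name = getattr(func, "__name__", repr(func))
        if name == "<lambda>":
            if lambdas_seen > 0:
                # Only subsequent lambdas get a numeric suffix.
                name = f"<lambda {lambdas_seen}>"
            lambdas_seen += 1
        names.append(name)
    return names


# Three lambdas and one named function aggregating the same column.
aggfuncs = [lambda x: 0, lambda x: 1, sum, lambda x: 2]
print(mangled_names(aggfuncs))
```

With unique names per column, the resulting (column, name) pairs no longer collide, which is what the relabeling step in keyword aggregation needs.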