sort=False ignored in Series groupby on MultiIndex levels #9444

Closed
cottrell opened this issue Feb 8, 2015 · 5 comments · Fixed by #9461

Comments

@cottrell
Contributor

cottrell commented Feb 8, 2015

Example:

import pandas
def geta():
    i = pandas.MultiIndex.from_tuples([(1, 2, 'a', 0),
                              (1, 2, 'a', 1),
                              (1, 1, 'b', 0),
                              (1, 1, 'b', 1),
                              (2, 1, 'b', 0),
                              (2, 1, 'b', 1)], names=['a', 'b', 'c', 'd'])
    a = pandas.Series([0, 1, 2, 3, 4, 5], index=i)
    return a

for dosort in [True, False]:
    a = geta()
    b = a.groupby(level=['a', 'b'], sort=dosort).first()
    a = None
    print('%s, sort=%s, \n%s' % (geta.__name__, dosort, b))

Output:

geta, sort=True,
a  b
1  1    2
   2    0
2  1    4
dtype: int64
geta, sort=False,
a  b
1  1    2
   2    0
2  1    4
dtype: int64

Also see: #9076
Possibly related to: #8868

@cottrell
Contributor Author

cottrell commented Feb 8, 2015

This seems to fix it. There might be a few more sort args missing elsewhere, for example in groupby._indexer_from_factorized.

diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index cb5dedc..06fbb55 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -1378,7 +1378,7 @@ class BaseGrouper(object):
         else:
             if len(all_labels) > 1:
                 group_index = get_group_index(all_labels, self.shape)
-                comp_ids, obs_group_ids = _compress_group_index(group_index)
+                comp_ids, obs_group_ids = _compress_group_index(group_index, sort=self.sort)
             else:
                 ping = self.groupings[0]
                 comp_ids = ping.labels

Output after patch:

geta, sort=True,
a  b
1  1    2
   2    0
2  1    4
dtype: int64
geta, sort=False,
a  b
1  2    0
   1    2
2  1    4
dtype: int64
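
The patch threads `sort` through to `_compress_group_index`. As a rough illustration of why that matters, here is a minimal pure-Python sketch of a compression step that honours a `sort` flag. The function `compress_group_index` below is a hypothetical stand-in written for this comment, not pandas' actual implementation, and the flat ids assume a row-major encoding of the `(a, b)` level codes with shape `(2, 2)`.

```python
def compress_group_index(group_index, sort=True):
    """Hypothetical stand-in for illustration only (not the pandas internals).

    Maps each flat group id to a compact id in 0..n_groups-1 and returns
    (comp_ids, obs_group_ids).
    """
    # Record each distinct flat id in order of first appearance.
    first_seen = {}
    for g in group_index:
        if g not in first_seen:
            first_seen[g] = len(first_seen)
    obs_group_ids = list(first_seen)

    # With sort=True the observed groups are reordered by flat id;
    # with sort=False they stay in first-appearance order.
    if sort:
        obs_group_ids = sorted(obs_group_ids)

    position = {g: i for i, g in enumerate(obs_group_ids)}
    comp_ids = [position[g] for g in group_index]
    return comp_ids, obs_group_ids


# Flat ids for the (a, b) pairs (1,2),(1,2),(1,1),(1,1),(2,1),(2,1),
# assuming codes a -> [0,0,0,0,1,1], b -> [1,1,0,0,0,0] and id = a*2 + b.
group_index = [1, 1, 0, 0, 2, 2]

print(compress_group_index(group_index, sort=True))
print(compress_group_index(group_index, sort=False))
```

If the `sort` argument is dropped at the call site, the sorted branch is always taken, which matches the identical outputs seen before the patch.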

@jreback
Contributor

jreback commented Feb 8, 2015

Try on master. IIRC this is all fixed.

@cottrell
Contributor Author

cottrell commented Feb 8, 2015

Just tried on master at the commit below and I am seeing no difference.

commit 107cb10
Merge: c4a996a f9031c2
Author: Joris Van den Bossche
Date: Sun Feb 8 14:35:47 2015 +0100

Merge pull request #9417 from cel4/plot_error

improved error message for invalid chart types

@jreback
Contributor

jreback commented Feb 10, 2015

@cottrell can you do a separate PR for this, with a test and a release note? You can then rebase the to_coo PR (#9076) on top. Thanks.
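
A test for this could look roughly like the sketch below. It reuses the series from the original report and checks the group order for both values of `sort`. The names `make_series` and `test_groupby_levels_respects_sort` are made up for illustration, and the `sort=False` expectation assumes a pandas version that includes the fix.

```python
import pandas as pd


def make_series():
    # Same data as the example in the report.
    index = pd.MultiIndex.from_tuples(
        [(1, 2, 'a', 0), (1, 2, 'a', 1), (1, 1, 'b', 0),
         (1, 1, 'b', 1), (2, 1, 'b', 0), (2, 1, 'b', 1)],
        names=['a', 'b', 'c', 'd'])
    return pd.Series([0, 1, 2, 3, 4, 5], index=index)


def test_groupby_levels_respects_sort():
    s = make_series()

    # sort=True: groups come back ordered by key.
    sorted_result = s.groupby(level=['a', 'b'], sort=True).first()
    assert list(sorted_result.index) == [(1, 1), (1, 2), (2, 1)]
    assert list(sorted_result) == [2, 0, 4]

    # sort=False: groups should come back in order of first appearance.
    unsorted_result = s.groupby(level=['a', 'b'], sort=False).first()
    assert list(unsorted_result.index) == [(1, 2), (1, 1), (2, 1)]
    assert list(unsorted_result) == [0, 2, 4]
```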

@jreback
Contributor

jreback commented Feb 10, 2015

this might interact with #9445
