Allow aggregate funcs to return arrays in groupby #3805
Thanks for putting that together. But, do you think we could:
rather than raise? |
Hmmm, about the message, we could cover ourselves a bit more and say "Function does not produce aggregated values. Will not be able to optimize and may produce unexpected results." |
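A minimal sketch of the warn-rather-than-raise idea, using only the standard library. The `PerformanceWarning` class and the `aggregate_groups` helper below are hypothetical stand-ins written for illustration, not the pandas implementation; only the message text comes from the suggestion above.

```python
import warnings


class PerformanceWarning(Warning):
    """Stand-in warning category for this sketch (not the pandas class)."""


def aggregate_groups(groups, func):
    """Apply func to each group's values; warn once if a result is a container."""
    results = {}
    warned = False
    for key, values in groups.items():
        out = func(values)
        # Treat built-in containers as "not aggregated"; a real check would
        # also need to handle arrays and other sequence types.
        if isinstance(out, (list, tuple, set, dict)) and not warned:
            warnings.warn(
                "Function does not produce aggregated values. "
                "Will not be able to optimize and may produce unexpected results.",
                PerformanceWarning,
                stacklevel=2,
            )
            warned = True
        results[key] = out
    return results


groups = {"a": [1, 2, 3], "b": [4, 5]}
totals = aggregate_groups(groups, sum)  # one scalar per group, no warning

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    as_lists = aggregate_groups(groups, sorted)  # one list per group, warns
```

The caller still gets a result in both cases; the warning only signals that the function did not reduce each group to a single value.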
What's the difference between |
|
@hayd maybe? I don't know. How do you test that the warning occurred? Also, how about we do even more couched:
|
I think like this (or do you have it working?). There are some PerformanceWarnings in test_pytables, but @jreback seems to ignore (?) them with:
Also, I think we may as well include tests for the current results of those calculations. :) |
I did add some tests like that with |
(haha! I think the experimental bit... is a little OTT!) The usual assertEqual(result, expected) tests :) |
Given there are multiple interpretations of what it means to produce an
On Fri, Jun 7, 2013 at 9:25 PM, Andy Hayden [email protected]:
|
@hayd I ignore them so they won't print on the test display; if you don't silence them, they will... |
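One way to silence warnings for a single test, sketched with the standard `warnings` module. The `noisy` function is a hypothetical placeholder, and this is not necessarily how test_pytables does it.

```python
import warnings


def noisy():
    """Hypothetical function that emits a warning as a side effect."""
    warnings.warn("slow path taken", UserWarning)
    return 42


with warnings.catch_warnings():
    # The filter change is undone when the block exits, so other tests
    # still see their warnings.
    warnings.simplefilter("ignore")
    value = noisy()
```

Because the filter is scoped to the `with` block, nothing is printed on the test display for that call, but warnings raised elsewhere are unaffected.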
Or if you use |
Ah of course! Personally I think that saying "may produce unexpected results" is enough to say we're not really supporting it, and having those assertEquals tests would just highlight when there's a problem and we can then choose to break. :) |
@jtratner that is the right way......I was ignoring them before I knew how to do that! |
Yeah, I just need to fix the tests slightly. |
Actually, I'd say that in general we should probably set the warning filter to |
The issue is that for some of these warnings we are explicitly testing something in order to generate the warning (like here, the performance issue), so you would have to put a try/except block around them, which would mask an actual problem.... |
Well, if you're testing something to generate the warning, you can do a few things:
I guess 1 is actually better in general. |
Also, @hayd I went with your warning message :) |
The only thing I would mention is that we might as well add tests for the actual results of those two calls. |
oh sorry! totally forgot. I'll add that.
On Sat, Jun 8, 2013 at 2:25 PM, Andy Hayden [email protected]:
|
Hmm, an additional test to add is the example from the original issue, as I think that is an unambiguous case (and after all, it works fine with lists...). Just going through them, none of the tests returned
In the first example they "obviously" really want the result (and this would also work for the second in this instance):
I really have no idea what's going on in the third one... unexpected is correct lol. In all cases there is a better way, I think. I see now what you are saying about ambiguity. So maybe tests for these weird examples aren't so important... :S |
Yeah, and it makes the test cases really annoying, because they have to deal with truth values in numpy arrays...ugh. I don't want to unintentionally force us to support weird edge case behavior with groupby. On the upside, this got me to write a nice little test utility to neatly check that something raises a warning. |
Okay, so this now has test cases & such. I'd vote to remove jtratner@d1ce722 (which specifically tests the results), but I wanted to include it if you want it. I also added a new test utility to check for warnings (jtratner@86d56ac). Given that it is a wrapper around a context manager and that it explicitly changes Doctests on the test utility all pass as well (not sure if that's included with the Travis tests). |
Yeah, you're right about removing jtratner@d1ce722 (sorry!). assert_produces_warning looks whizzy! |
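For readers who have not seen the utility, a rough sketch of what a warning-checking context manager can look like. This is not the code from jtratner@86d56ac; the name `expect_warning` and its signature are assumptions made for illustration.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def expect_warning(category=Warning):
    """Fail unless the wrapped block emits a warning of the given category."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" keeps repeated warnings from being suppressed.
        warnings.simplefilter("always")
        yield caught
    if not any(issubclass(w.category, category) for w in caught):
        raise AssertionError(
            "did not see a warning of type %s" % category.__name__
        )


with expect_warning(UserWarning) as seen:
    warnings.warn("non-aggregating function", UserWarning)
```

Wrapping the check in a context manager keeps the filter change local to the block under test, so a test can assert on the warning without a try/except that might hide a real failure.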
Hahaha, I just finished fixing the test case from the SO question. Removing that forever and then pushing. |
Okay, this should all pass + I added notes to |
Never mind. I'm wrong, needed to reset the warning filter each time in |
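A small sketch of why the filter has to be set inside each recording block, assuming a current Python 3 interpreter (older interpreters may behave differently). The helper names are hypothetical.

```python
import warnings


def warn_from_same_line():
    """Always warns from the same source location."""
    warnings.warn("degraded performance", UserWarning)


def count_warnings(func, calls):
    """Record how many warnings func emits over the given number of calls."""
    with warnings.catch_warnings(record=True) as caught:
        # Set the filter on every entry: catch_warnings restores the previous
        # filters on exit, so an earlier "always" does not carry over.
        warnings.simplefilter("always")
        for _ in range(calls):
            func()
    return len(caught)


first = count_warnings(warn_from_same_line, 2)
second = count_warnings(warn_from_same_line, 2)
```

With the filter set on each entry, every call is recorded in both runs; without it, the default filter may show a warning from a given location only once.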
Did you ever find any way of getting the previous |
@hayd Nope. Not at all - very weird. |
can u post the example that works now (with the warning ) and that didn't work before |
The example from #3788:
As an aside, I just threw in list to see what would happen (this is current behaviour in master):
|
Maybe it should read "Function may not produce aggregated values." lol |
I don't think it's worth it to change the |
I still claim there's no ambiguity in either of those results, imo they both aggregate. |
Looks like tests are failing? What to do here? |
@wesm not sure - occurred after I rebased on to the most recent master. Testing for warnings is really finicky. |
Made the test function simpler, hopefully it will just work this time. |
One test failing, and now needs a rebase (maybe worth squashing first). |
Well, I've been trying a bunch of different ways to get it to work and none
On Sun, Jun 9, 2013 at 11:58 PM, Andy Hayden [email protected]:
|
…e a PerformanceWarning instead warns that a non-aggregating function will result in degraded performance
TST: Catch warnings in test_groupby
@hayd can you run this and see what you get? jtratner@dfb13e0 . I can't reproduce the error that Travis CI gets so it's difficult for me to fix it. |
I can't reproduce either... |
I'm thinking it's a Travis error (if you notice, a few other warnings
On Mon, Jun 10, 2013 at 6:33 AM, Andy Hayden [email protected]:
|
I managed to reproduce this on my computer in 2.6! Now hopefully I can figure out why... |
Update part 2: never mind, it happened two times on my system and then never failed again - I'm trying to rerun an old commit that worked and then see whether it fails when rebased onto the newest master. |
@jtratner how's this coming? |
@jreback I tried testing the failing example in travis on master - turns
On Wed, Jun 12, 2013 at 10:10 AM, jreback [email protected] wrote:
|
@jreback I've been watching the tracing and have learned some things: for example, the |
ok....yeh lots of type inference operations that groupby needs....lmk |
Are we pushing this to 0.12 for consideration? |
That seems reasonable to me, it's relatively low importance and not worth being a blocker for the release. |
If someone else wants to pick this up, go ahead. Going to close - I don't think it's worth the time for an edge case like this. |
fixes #3788
Please check whether you like the error message for Performance Warning. Also, I'm not sure whether this means that groupby fails under certain conditions and not others (like when trying Cython, etc.).