Bug in Fancy/Boolean Indexing with nested lists #2702
Comments
Oh boy. Hitting a bunch of buggy/underspecified NumPy stuff here. I'm having a look but may kick this can down the road.
This is all NumPy behavior. It's going to be too much work for me to fix this anytime soon. I'm already completely fed up with the NumPy library, so I would like to overhaul all this mess to make it consistent at some point in the future.
You're right. I just validated the same bugs on a plain ndarray. Do you think there is any value in raising this issue on a NumPy forum? Thanks for looking into these corner cases. Pandas just keeps getting better and I find myself using it more and more when dealing with any non-trivial dataset.
@jreback is this resolved for pandas now that
Did I miss something? Series is no longer an NDFrame?
I will take a look - haven't seen this issue before
@cpcloud whoops! miswrote - meant no longer an ndarray
@jtratner No worries! Figured it was something like that... just wanted to stay in the loop!
This is easy to make all of these act the same, just an extension in so this good (#2745)
else it is converted to a
Fancy or Boolean indexing on a Series has two strange behaviors. My examples only show the behavior with Fancy indexing, but it's the same for Boolean indexing.
LHS vs RHS length
I would have expected an error, similar to what I get with slice indexing
An even odder behavior is when you have too few items in the RHS
It seems to be using something like itertools.cycle which seems very arbitrary to me
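To make the comparison with itertools.cycle concrete, here is a toy model in plain Python of what "cycling" a too-short RHS would look like. The helper name `cyclic_fill` is hypothetical and exists only for illustration; this is a sketch of the behavior as described above, not how pandas or NumPy actually implement fancy-index assignment.

```python
from itertools import cycle


def cyclic_fill(target, positions, values):
    """Assign values to target[positions], repeating values when too few are given.

    Hypothetical helper: a toy model of the reported behavior only.
    """
    # zip stops when positions is exhausted; cycle(values) never runs out
    # as long as values is non-empty.
    for pos, val in zip(positions, cycle(values)):
        target[pos] = val
    return target


data = [0, 0, 0, 0, 0]
cyclic_fill(data, [0, 1, 2, 3], [10, 20])
print(data)  # [10, 20, 10, 20, 0]
```

Four positions receive values from a two-item RHS with no error, which is the kind of silent repetition the report objects to.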
Nested RHS
This may seem like a strange use of pandas, but I need to store Python lists
Very strange. It's like it flattens the input first.
But this flattening only happens if the nested levels are all the same size.
I know in numpy the array constructor would make a distinction between these two inputs, so maybe that's the reason for the difference, but I still don't see why ndarrays are being flattened.
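The distinction made by the array constructor can be seen directly in NumPy. The snippet below is a minimal sketch, not the code from the original report. It assumes a recent NumPy, where ragged input has to be given `dtype=object` explicitly (older releases inferred an object array on their own).

```python
import numpy as np

# Inner lists of equal length become a second array dimension.
uniform = np.array([[1, 2], [3, 4]])

# Inner lists of different lengths cannot form a rectangular array, so the
# result is a 1-D object array whose elements are the Python lists themselves.
ragged = np.array([[1, 2], [3]], dtype=object)

print(uniform.shape)    # (2, 2)
print(ragged.shape)     # (2,)
print(type(ragged[0]))  # <class 'list'>
```

Any assignment path that passes the RHS through `np.array` therefore sees two rows of scalars in the first case and two list objects in the second, which is consistent with the flattening appearing only for equal-sized nesting.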
I can work around the issue by converting the RHS to a 1-D array and passing that in.
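One way to build such a 1-D array is sketched below. The exact workaround code from the report is not preserved here, so treat this as an assumption about its shape. The helper name `as_object_array` is hypothetical.

```python
import numpy as np


def as_object_array(items):
    """Return a 1-D object array holding each item unchanged (hypothetical helper)."""
    out = np.empty(len(items), dtype=object)
    # Assign element by element so that equal-length lists are stored as
    # list objects instead of being expanded into a second dimension.
    for i, item in enumerate(items):
        out[i] = item
    return out


rhs = as_object_array([[1, 2], [3, 4]])
print(rhs.shape)  # (2,)
print(rhs[0])     # [1, 2]
```

Because the array is already 1-D with one element per target position, there is nothing left for a later array conversion to flatten.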
Slice indexing doesn't have this problem at all
My Question: Are these behaviors a bug or a "feature"? I think Fancy/Boolean indexing should operate the same as slice indexing -- i.e. check for matching lengths and don't auto-convert to numpy array.