BUG: Categoricals cannot compare to ndim=0 array values #8658

Closed
immerrr opened this issue Oct 28, 2014 · 6 comments · Fixed by #8706
Labels
Bug Categorical Categorical Data Type
Comments

@immerrr
Contributor

immerrr commented Oct 28, 2014

Reproduction steps:

In [1]: cat = pd.Categorical([1,2,3])

In [2]: cat > cat[0]
Out[2]: array([False,  True,  True], dtype=bool)

In [3]: cat[0] < cat
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-5c4bb0678680> in <module>()
----> 1 cat[0] < cat

/home/dshpektorov/sources/pandas/pandas/core/categorical.pyc in f(self, other)
     52             msg = "Cannot compare a Categorical for op {op} with type {typ}. If you want to \n" \
     53                   "compare values, use 'np.asarray(cat) <op> other'."
---> 54             raise TypeError(msg.format(op=op,typ=type(other)))
     55 
     56     f.__name__ = op

TypeError: Cannot compare a Categorical for op __gt__ with type <type 'numpy.ndarray'>. If you want to compare values, use 'np.asarray(cat) <op> other'.

The problem is that when cat[0] is the first operand (type == np.int64), it drives the operation, and its first step is to cast itself to a rank-0 array (it becomes array(1) rather than np.int64(1)). At that point it no longer passes the np.isscalar check, so the corresponding branch of execution is skipped.
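To make the failing check concrete, here is a minimal sketch using only numpy. The variable names are illustrative; the values mirror the `np.int64(1)` and `array(1)` forms mentioned above.

```python
import numpy as np

scalar = np.int64(1)     # the form cat[0] has in the report
zero_dim = np.array(1)   # the rank-0 array form it is cast to

# np.isscalar accepts the numpy scalar but rejects the rank-0 array.
scalar_passes = np.isscalar(scalar)
zero_dim_passes = np.isscalar(zero_dim)

print(scalar_passes, zero_dim_passes, zero_dim.ndim)
```

Any code path guarded by `np.isscalar(other)` therefore treats the two forms differently, even though both hold the same single value.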

The fix is trivial, feel free to beat me to it.

On a broader scale, we might want to review all uses of np.isscalar and see if they can also benefit from accepting rank-0 arrays. If that's true, we might want to add pd.isscalar function and use it everywhere instead.
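As a rough sketch of what such a helper could look like: the names `is_scalar_like` and `unwrap_scalar` below are hypothetical, not existing pandas functions, and this is not necessarily how the eventual fix was written.

```python
import numpy as np


def is_scalar_like(value):
    """Hypothetical helper: True for scalars and for rank-0 ndarrays."""
    if isinstance(value, np.ndarray):
        return value.ndim == 0
    return np.isscalar(value)


def unwrap_scalar(value):
    """Hypothetical helper: convert a rank-0 ndarray to a numpy scalar."""
    if isinstance(value, np.ndarray) and value.ndim == 0:
        # Indexing a rank-0 array with an empty tuple yields its scalar.
        return value[()]
    return value


print(is_scalar_like(np.array(1)), is_scalar_like(np.array([1])))
print(type(unwrap_scalar(np.array(1))))
```

A comparison method could call `unwrap_scalar(other)` first and then proceed along its existing scalar branch, assuming that branch only needs a plain scalar value.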

@immerrr immerrr changed the title BUG: categorical values cannot compare to ndim=0 array values BUG: Categoricals cannot compare to ndim=0 array values Oct 28, 2014
@jreback jreback added Bug Categorical Categorical Data Type labels Oct 28, 2014
@immerrr
Contributor Author

immerrr commented Oct 29, 2014

There's an inconsistency between comparison and arithmetic operations in numpy; I've reported it here.

@jreback
Contributor

jreback commented Oct 29, 2014

a numpy bug - say it's not so :)

@immerrr
Contributor Author

immerrr commented Oct 29, 2014

Sure looks like it to me.

Regardless though, I think rank-0 arrays should be treated the same as their pure scalar counterparts. Is there a reason not to do that?

@jreback
Contributor

jreback commented Oct 29, 2014

that sounds right - not sure we can fix this without the upstream fix though?

@immerrr
Contributor Author

immerrr commented Oct 29, 2014

Well, in numpy terms, a scalar and a zero-dim array are different entities with their own properties. There were discussions about removing one of them in favour of the other, but AFAICT none reached a solid decision to do so (here is an interesting summary).
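A few of those differing properties can be seen directly. This is a small illustrative sketch using only numpy; the variable names are made up for the example.

```python
import numpy as np

scalar = np.int64(1)
zero_dim = np.array(1)

# Different types: numpy scalars derive from np.generic,
# zero-dim arrays are ordinary ndarrays.
scalar_is_generic = isinstance(scalar, np.generic)
scalar_is_array = isinstance(scalar, np.ndarray)
zero_dim_is_array = isinstance(zero_dim, np.ndarray)

# Both report zero dimensions.
same_ndim = np.ndim(scalar) == zero_dim.ndim == 0

# A zero-dim array can be assigned to in place; a numpy scalar cannot.
zero_dim[()] = 5

print(scalar_is_generic, scalar_is_array, zero_dim_is_array, same_ndim)
print(zero_dim.item())
```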

My thinking about a fix was that there are de facto two ways of expressing a scalar value in numpy, and there may be places in pandas where only one of them is handled properly. Pandas has a long history of fixing numpy's shortcomings; maybe this case should be another manifestation of the "doing it right" principle.

@jreback
Contributor

jreback commented Oct 29, 2014

I agree with you in principle about a pd.isscalar - go for it if you have time

would make indexing logic detection maybe a bit simpler
