ERR: stat function kwarg interpretation #12301

Closed
mikeyshulman opened this issue Feb 11, 2016 · 13 comments
Labels: API Design, Error Reporting (Incorrect or improved errors from pandas)

@mikeyshulman

df.max(axi=1) does not throw an error and evaluates to df.max()
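
A minimal sketch of why the typo goes unnoticed, using a toy function rather than pandas itself (`toy_max` and its signature are hypothetical): any signature that ends in `**kwargs` absorbs a misspelled keyword, so the intended `axis` silently keeps its default.

```python
def toy_max(rows, axis=0, **kwargs):
    """Toy stand-in for a stat function; not the pandas implementation."""
    # kwargs is accepted but never inspected, so typos vanish silently.
    if axis == 0:
        return [max(col) for col in zip(*rows)]  # max of each column
    return [max(row) for row in rows]            # max of each row


rows = [[1, 5], [3, 2]]
default_result = toy_max(rows)
typo_result = toy_max(rows, axi=1)  # 'axi' lands in kwargs; axis stays 0
```

Here `typo_result` equals `default_result`, which mirrors the reported behaviour.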

@jreback
Contributor

jreback commented Feb 12, 2016

These functions accept **kwargs so that we don't have to name certain numpy arguments (e.g. out) that pandas does not otherwise accept. I suppose we could check whether any keyword other than those compat arguments is passed and raise a nice error message.

I don't want to have out=None in the signature or the doc-string, though.

Pull requests would be welcome.

jreback changed the title from "max kwarg interpretation" to "stat function kwarg interpretation" Feb 12, 2016
jreback changed the title from "stat function kwarg interpretation" to "ERR: stat function kwarg interpretation" Feb 12, 2016
jreback added the API Design, Error Reporting, and Difficulty Intermediate labels Feb 12, 2016
jreback added this to the Next Major Release milestone Feb 12, 2016
@gfyoung
Member

gfyoung commented Feb 13, 2016

I'm not entirely sure how all of these functions are set up, but when I tried taking the **kwargs argument out of just one of the method signatures here, it broke a ton of tests (about 40). I keep getting TypeError: stat_func() got an unexpected keyword argument 'out', which traces back to function calls in numpy's fromnumeric.py such as amax(axis=axis, out=out).
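
The failure mode can be sketched with toy stand-ins (`numpy_style_amax`, `Tolerant`, and `Strict` are hypothetical names, not numpy or pandas code). The assumption is that a numpy-style wrapper forwards `out=` to the object's own method, so a method without `**kwargs` raises the `TypeError` quoted above.

```python
def numpy_style_amax(obj, axis=None, out=None):
    # Assumed behaviour for illustration: delegate to the object's own
    # max(), always forwarding out= even when the caller did not set it.
    return obj.max(axis=axis, out=out)


class Tolerant:
    def __init__(self, data):
        self.data = data

    def max(self, axis=None, **kwargs):  # swallows out=None
        return max(self.data)


class Strict(Tolerant):
    def max(self, axis=None):  # no **kwargs, so out= is rejected
        return max(self.data)


tolerant_result = numpy_style_amax(Tolerant([1, 4, 2]))

try:
    numpy_style_amax(Strict([1, 4, 2]))
except TypeError as exc:
    strict_error = str(exc)
```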

IINM it seems that this tolerance for "invalid" arguments is quite ingrained into the codebase, so much so that even the tests seem to allow it.

@gfyoung
Member

gfyoung commented Feb 13, 2016

Unless someone can explain otherwise, it seems that the best that can be done is to check whether kwargs is non-empty, after which you can raise a warning like: UserWarning: invalid arguments passed into {insert function name}. Please pass in only arguments listed in the function signature
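
A rough sketch of that warning-based idea, assuming a toy function (`toy_sum` is hypothetical, and the message wording is adapted from the comment above):

```python
import warnings


def toy_sum(values, axis=None, **kwargs):
    """Toy stat function that warns when any extra keyword is supplied."""
    if kwargs:
        warnings.warn(
            "invalid arguments passed into toy_sum: "
            f"{sorted(kwargs)}. Please pass in only arguments listed "
            "in the function signature",
            UserWarning,
            stacklevel=2,  # point the warning at the caller
        )
    return sum(values)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    total = toy_sum([1, 2, 3], axi=1)
```

Note that this variant would also warn on an `out=None` forwarded by a numpy-style caller, since it treats every extra keyword alike.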

@jreback
Contributor

jreback commented Feb 13, 2016

What you can do is pop out from the kwargs; then, if what remains is not empty, raise.

IIRC out is the only argument we accept that's not listed (and we don't want it), but we allow it for compat.
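
The pop-then-raise suggestion might look roughly like this (a sketch on a hypothetical `toy_min`, not the code that was eventually merged):

```python
def toy_min(values, axis=None, **kwargs):
    # Discard the compat-only argument; it is accepted but ignored.
    kwargs.pop("out", None)
    if kwargs:
        # Anything still left was never a valid argument.
        raise TypeError(
            f"toy_min() got unexpected keyword argument(s): {sorted(kwargs)}"
        )
    return min(values)


ok = toy_min([3, 1, 2], out=None)  # tolerated for numpy compatibility

try:
    toy_min([3, 1, 2], axi=1)
except TypeError as exc:
    message = str(exc)
```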

@gfyoung
Member

gfyoung commented Feb 13, 2016

How about just accepting any arguments that currently break the tests? out is not the only one. dtype is another parameter that breaks at least one test.
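
With more than one tolerated name, a shared helper keeps the check in one place. The sketch below assumes the allowed set is just the two names mentioned so far (`out` and `dtype`); `COMPAT_KWARGS`, `validate_compat_kwargs`, and `toy_mean` are hypothetical.

```python
COMPAT_KWARGS = frozenset({"out", "dtype"})  # names taken from this thread


def validate_compat_kwargs(func_name, kwargs, allowed=COMPAT_KWARGS):
    """Raise TypeError naming every keyword outside the allowed set."""
    unexpected = sorted(set(kwargs) - allowed)
    if unexpected:
        raise TypeError(
            f"{func_name}() got unexpected keyword argument(s): "
            + ", ".join(unexpected)
        )


def toy_mean(values, axis=None, **kwargs):
    # The compat names stay out of the signature but are still tolerated.
    validate_compat_kwargs("toy_mean", kwargs)
    return sum(values) / len(values)
```

Calling `toy_mean([1, 2, 3], out=None, dtype=None)` passes the check, while `toy_mean([1], axi=1)` raises a `TypeError` that names `axi`.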

@jreback
Contributor

jreback commented Feb 13, 2016

yeah that too

@jreback
Contributor

jreback commented Feb 13, 2016

IIRC the reason I did this in the first place was the API changes across various versions of numpy; I didn't want to have to deal with strict checking in case something was added.

@jreback
Contributor

jreback commented Feb 13, 2016

I am OK with fine-grained checking.
I just don't really want those arguments in the signature.

@gfyoung
Member

gfyoung commented Feb 13, 2016

That's fair. It's too bad that there are so many naming conflicts between the numpy and pandas codebases, since **kwargs is not even used in the functions; it exists only for compatibility between the libraries. Should **kwargs be included in the documentation? I did not realize initially that this was the reasoning.

@jreback
Contributor

jreback commented Feb 13, 2016

Not sure what you mean by naming conflicts.

@gfyoung
Member

gfyoung commented Feb 13, 2016

What I meant by "conflict" was that the tracebacks always led back to calls in the numpy library to functions that evidently must also be defined in pandas, IIUC.

@jreback
Contributor

jreback commented Feb 13, 2016

BTW, we first try to dispatch to bottleneck (if installed), then to numpy. Things are tested with both.
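
The fallback order described here can be illustrated generically. This is only a sketch of "try the optional backend first, then the next one" (`first_available` is a hypothetical helper, and it does not reflect how pandas actually wires up its dispatch):

```python
import importlib


def first_available(module_names):
    """Return the first importable module from module_names, or None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # optional backend missing; try the next one
    return None


# Preference order from the comment above: bottleneck, then numpy.
# The result is None when neither package is installed.
backend = first_available(["bottleneck", "numpy"])
```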

jreback modified the milestones: 0.18.0, Next Major Release Feb 13, 2016
gfyoung added a commit to forking-repos/pandas that referenced this issue Feb 14, 2016
Filters kwargs argument in stat functions to
prevent the passage of clearly invalid arguments
while at the same time maintaining compatibility
with analogous numpy functions.

Closes gh-12301.
jreback pushed a commit that referenced this issue Feb 14, 2016
Addresses issue #12301 by filtering `kwargs` argument in stat
functions to prevent the passage of clearly invalid arguments while at
the same time maintaining compatibility with analogous `numpy`
functions.

Author: gfyoung

Closes #12318 from gfyoung/kwarg_remover and squashes the following commits:

f9de80f [gfyoung] BUG: Prevent abuse of kwargs in stat functions
@jreback
Contributor

jreback commented Feb 14, 2016

closed by #12318

jreback closed this as completed Feb 14, 2016
3 participants