Inconsistency, NaT included in result of groupby method first but not NaN #10590
Comments
hmm, @sinhrks I thought this was fixed?
I remember some issues when
prob needs some adjustment for comparison vs iNaT in the first/last (though i thought it was there)
Confirmed
@sinhrks :) I see. Well, I have started to look into GitHub and the pandas source code a bit, but unfortunately I don't have much time available right now, so waiting for me could require patience.
For groupby, the timestamps get converted to the integer value tslib.iNaT, which is -9223372036854775808. The aggregation is then done using this value, which gives an incorrect result. The solution proposed here is to replace that value with np.nan when the column is a datetime or timedelta.
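To illustrate the sentinel described in the commit message, here is a minimal sketch using NumPy only. It is not the actual patch: the names INAT, as_int and as_float are made up for this example, and it only shows that NaT is stored as the smallest int64 and how that value could be masked to NaN before aggregating.

```python
import numpy as np

# NaT is stored as the smallest 64-bit integer (the value the thread calls iNaT).
INAT = np.iinfo(np.int64).min  # -9223372036854775808

# A datetime array with one real value and one missing value.
dates = np.array(["2015-07-15", "NaT"], dtype="datetime64[ns]")

# Viewing the data as int64 exposes the sentinel directly.
as_int = dates.view("i8")

# Illustrative masking step: cast to float and replace the sentinel with NaN,
# so that NaN-aware logic treats the entry as missing.
as_float = as_int.astype("float64")
as_float[as_int == INAT] = np.nan
```

Without a step like the masking above, an aggregation working on the raw integers would see the sentinel as an ordinary (very small) number rather than as a missing value.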
NaT is included in the result of the groupby method first while NaN is not. I am expecting that first should skip both NaN and NaT and include the first value where pandas.isnull is False.

Demonstration of the inconsistency (note that both NaT and NaN in the data frame are produced by np.nan; the difference is that the d_t column contains date values).

Resulting data frame:

Grouping this data frame on the IX column and executing the first method results in this data frame, which shows the inconsistency between the d_t and num columns.

Resulting data frame:
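The original snippet and its output did not survive in this copy of the issue. The following is a minimal sketch of an equivalent reproduction: the column names IX, d_t and num come from the description, but the concrete values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Both missing markers start out as np.nan; in the date column they become NaT.
df = pd.DataFrame({
    "IX": [1, 1, 2, 2],
    "d_t": pd.to_datetime([np.nan, "2015-07-15", np.nan, "2015-07-16"]),
    "num": [np.nan, 1.0, np.nan, 2.0],
})

# first() is expected to return the first non-null value per group and column.
result = df.groupby("IX").first()
print(result)
```

On a pandas version with the fix, both d_t and num should hold the first non-null value in each group. On the affected versions, as reported here, d_t kept NaT while num skipped NaN.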