pd.merge fails on datetime columns with tzinfo #11405


Closed
miraculixx opened this issue Oct 21, 2015 · 1 comment · Fixed by #11410
Labels
Bug; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode); Timezones (Timezone data dtype)
Comments

@miraculixx

Since pandas 0.17, a merge on a datetime column fails if the datetimes are tz-aware; see the example below. Possibly related to #9663?

import pandas as pd
from datetime import datetime
from dateutil.tz import gettz
import sys, os
import traceback as tbm
# works
a = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
b = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
pd.merge(a, b, how='outer')
# doesn't work (used to work on pandas-0.16.2)
try:
    utc = gettz('UTC')
    a = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    b = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    pd.merge(a, b, how='outer')
except Exception as e:
    print "Yeah, doesn't work: %s" % e   
    _, _, tb = sys.exc_info()
    stack = lambda n : tbm.extract_tb(tb, 99)[n][0:]
    print "called from", stack(0)
    print "failing statement", stack(-1)

=>

Yeah, doesn't work: type object argument after * must be a sequence, not itertools.imap
called from ('<ipython-input-194-3c3669b26a55>', 23, '<module>', u"pd.merge(a, b, how='outer')")
failing statement ('/.../local/lib/python2.7/site-packages/pandas/tools/merge.py', 516, '_get_join_indexers', 'llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))')

The culprit seems to be in the call to _factorize_keys, though I couldn't quite figure out what goes wrong.
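One way to narrow this down: as far as I know, CPython 2.7 replaces a TypeError raised while unpacking a lazy iterator after `*` with the generic "argument after * must be a sequence" message, so the message above may be hiding the real error raised inside the per-key call. The sketch below is not pandas code. `factorize_stub` and `join_labels` are hypothetical stand-ins that only mirror the shape of the failing statement, and the per-key results are built eagerly in a list so that any exception keeps its own message and traceback.

```python
def factorize_stub(left_key, right_key):
    """Hypothetical stand-in for the per-key factorizer; not the pandas function."""
    if left_key is None or right_key is None:
        # Simulates an unsupported key type failing inside the factorizer
        raise TypeError("cannot factorize key of type NoneType")
    # Left labels, right labels, number of distinct values (dummy data)
    return [0, 1], [0, 1], 2


def join_labels(left_keys, right_keys, fkeys=factorize_stub):
    # Build the per-key results eagerly instead of unpacking a lazy map,
    # so an exception raised inside fkeys propagates unchanged.
    results = [fkeys(l, r) for l, r in zip(left_keys, right_keys)]
    # Transpose: one list of left labels, one of right labels, one of shapes
    llab, rlab, shape = (list(col) for col in zip(*results))
    return llab, rlab, shape


if __name__ == "__main__":
    print(join_labels(["a", "b"], ["a", "b"]))
    try:
        join_labels(["a", None], ["a", "b"])
    except TypeError as exc:
        print("underlying error:", exc)
```

Applying the same idea to the real call (evaluating the per-key function once per key pair by hand) should show which exception is actually raised for the tz-aware column.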

Version info

$ python --version
Python 2.7.6
$ pip freeze | grep pandas
pandas==0.17.0
jreback added the Bug, Reshaping, and Timezones labels on Oct 22, 2015
jreback added this to the 0.17.1 milestone on Oct 22, 2015
@jreback
Contributor

jreback commented Oct 22, 2015

nah, this was not implemented (with the new tz dtypes) and not tested, fixed in #11410
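To check whether an installed pandas includes the fix, the tz-aware merge from the report can be rerun. This is a minimal sketch, assuming Python 3 and a pandas release that contains the fix (the issue is assigned to the 0.17.1 milestone). It uses the standard library's `timezone.utc` instead of dateutil. The `merge_on_naive_utc` helper is hypothetical and not from this thread: it shows one possible workaround, merging on tz-naive UTC values and restoring the timezone afterwards, and has not been verified against 0.17.0 itself.

```python
from datetime import datetime, timezone

import pandas as pd


def make_frame():
    # Same shape as the frames in the report, with tz-aware datetimes
    return pd.DataFrame({
        "created": [datetime(2015, 10, 10, tzinfo=timezone.utc),
                    datetime(2015, 10, 20, tzinfo=timezone.utc)],
        "count": [1, 2],
    })


def merge_on_naive_utc(left, right, key="created", **kwargs):
    """Hypothetical workaround: merge on tz-naive UTC values, then re-localize."""
    # Convert the key to UTC and drop the timezone on copies of both frames
    left = left.assign(**{key: left[key].dt.tz_convert("UTC").dt.tz_localize(None)})
    right = right.assign(**{key: right[key].dt.tz_convert("UTC").dt.tz_localize(None)})
    merged = pd.merge(left, right, **kwargs)
    # Restore the UTC timezone on the merged key column
    merged[key] = merged[key].dt.tz_localize("UTC")
    return merged


a, b = make_frame(), make_frame()
direct = pd.merge(a, b, how="outer")                # raised in 0.17.0 per the report
workaround = merge_on_naive_utc(a, b, how="outer")
print(direct.dtypes)
print(workaround.dtypes)
```

With the fix in place, both results should have two rows and a tz-aware `created` column.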

jreback added a commit to jreback/pandas that referenced this issue Oct 23, 2015
jreback added a commit that referenced this issue Oct 23, 2015
Bug in merging datetime64[ns, tz] dtypes #11405