Commit 3c2cb0f

Merge remote-tracking branch 'upstream/master' into ea-sparse-2
2 parents 0a37050 + 7725fa0 commit 3c2cb0f

54 files changed: 608 additions, 472 deletions

asv_bench/benchmarks/reshape.py

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ class GetDummies(object):
 
     def setup(self):
         categories = list(string.ascii_letters[:12])
-        s = pd.Series(np.random.choice(categories, size=1_000_000),
+        s = pd.Series(np.random.choice(categories, size=1000000),
                       dtype=pd.api.types.CategoricalDtype(categories))
         self.s = s
 
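
Note on this hunk: the only change is the spelling of the integer literal. Underscore digit separators (`1_000_000`) are accepted only from Python 3.6 onward (PEP 515), while the plain form parses on every version. The commit does not state its motivation; the sketch below only demonstrates that the two spellings denote the same value where both parse. The helper name `literal_parses` is hypothetical.

```python
def literal_parses(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        # compile() parses at runtime, so an unsupported literal raises
        # SyntaxError here instead of breaking the whole module on import.
        compile(source, "<literal>", "eval")
        return True
    except SyntaxError:
        return False


plain = "1000000"
separated = "1_000_000"

print(literal_parses(plain))      # plain digits parse on every Python version
print(literal_parses(separated))  # needs Python 3.6+ (PEP 515)

if literal_parses(separated):
    # Where both spellings parse, they are the same integer.
    assert eval(separated) == eval(plain) == 10 ** 6
```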

doc/source/io.rst

Lines changed: 1 addition & 1 deletion
@@ -298,7 +298,7 @@ compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``
   Set to ``None`` for no decompression.
 
   .. versionadded:: 0.18.1 support for 'zip' and 'xz' compression.
-
+  .. versionchanged:: 0.24.0 'infer' option added and set to default.
 thousands : str, default ``None``
   Thousands separator.
 decimal : str, default ``'.'``

doc/source/whatsnew/v0.23.4.txt

Lines changed: 2 additions & 34 deletions
@@ -1,7 +1,7 @@
 .. _whatsnew_0234:
 
-v0.23.4
--------
+v0.23.4 (August 3, 2018)
+------------------------
 
 This is a minor bug-fix release in the 0.23.x series and includes some small regression fixes
 and bug fixes. We recommend that all users upgrade to this version.
@@ -21,7 +21,6 @@ Fixed Regressions
 ~~~~~~~~~~~~~~~~~
 
 - Python 3.7 with Windows gave all missing values for rolling variance calculations (:issue:`21813`)
--
 
 .. _whatsnew_0234.bug_fixes:
 
@@ -32,37 +31,6 @@ Bug Fixes
 
 - Bug where calling :func:`DataFrameGroupBy.agg` with a list of functions including ``ohlc`` as the non-initial element would raise a ``ValueError`` (:issue:`21716`)
 - Bug in ``roll_quantile`` caused a memory leak when calling ``.rolling(...).quantile(q)`` with ``q`` in (0,1) (:issue:`21965`)
--
-
-**Conversion**
-
--
--
-
-**Indexing**
-
--
--
-
-**I/O**
-
--
--
-
-**Categorical**
-
--
--
-
-**Timezones**
-
--
--
-
-**Timedelta**
-
--
--
 
 **Missing**
 

doc/source/whatsnew/v0.24.0.txt

Lines changed: 11 additions & 7 deletions
@@ -177,7 +177,8 @@ Other Enhancements
 - :func:`read_html` copies cell data across ``colspan`` and ``rowspan``, and it treats all-``th`` table rows as headers if ``header`` kwarg is not given and there is no ``thead`` (:issue:`17054`)
 - :meth:`Series.nlargest`, :meth:`Series.nsmallest`, :meth:`DataFrame.nlargest`, and :meth:`DataFrame.nsmallest` now accept the value ``"all"`` for the ``keep`` argument. This keeps all ties for the nth largest/smallest value (:issue:`16818`)
 - :class:`IntervalIndex` has gained the :meth:`~IntervalIndex.set_closed` method to change the existing ``closed`` value (:issue:`21670`)
-- :func:`~DataFrame.to_csv` and :func:`~DataFrame.to_json` now support ``compression='infer'`` to infer compression based on filename (:issue:`15008`)
+- :func:`~DataFrame.to_csv`, :func:`~Series.to_csv`, :func:`~DataFrame.to_json`, and :func:`~Series.to_json` now support ``compression='infer'`` to infer compression based on filename extension (:issue:`15008`).
+  The default compression for ``to_csv``, ``to_json``, and ``to_pickle`` methods has been updated to ``'infer'`` (:issue:`22004`).
 - :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`)
 - :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`)
 
@@ -488,6 +489,8 @@ Deprecations
 - :meth:`MultiIndex.to_hierarchical` is deprecated and will be removed in a future version (:issue:`21613`)
 - :meth:`Series.ptp` is deprecated. Use ``numpy.ptp`` instead (:issue:`21614`)
 - :meth:`Series.compress` is deprecated. Use ``Series[condition]`` instead (:issue:`18262`)
+- :meth:`Categorical.from_codes` has deprecated providing float values for the ``codes`` argument. (:issue:`21767`)
+- :func:`pandas.read_table` is deprecated. Instead, use :func:`pandas.read_csv` passing ``sep='\t'`` if necessary (:issue:`21948`)
 
 .. _whatsnew_0240.prior_deprecations:
 
@@ -496,7 +499,7 @@ Removal of prior version deprecations/changes
 
 - The ``LongPanel`` and ``WidePanel`` classes have been removed (:issue:`10892`)
 - Several private functions were removed from the (non-public) module ``pandas.core.common`` (:issue:`22001`)
--
+- Removal of the previously deprecated module ``pandas.core.datetools`` (:issue:`14105`, :issue:`14094`)
 -
 
 .. _whatsnew_0240.performance:
@@ -535,9 +538,7 @@ Bug Fixes
 Categorical
 ^^^^^^^^^^^
 
--
--
--
+- Bug in :meth:`Categorical.from_codes` where ``NaN`` values in `codes` were silently converted to ``0`` (:issue:`21767`). In the future this will raise a ``ValueError``. Also changes the behavior of `.from_codes([1.1, 2.0])`.
 
 Datetimelike
 ^^^^^^^^^^^^
@@ -571,6 +572,7 @@ Timezones
 - Bug in :class:`DatetimeIndex` where constructing with an integer and tz would not localize correctly (:issue:`12619`)
 - Fixed bug where :meth:`DataFrame.describe` and :meth:`Series.describe` on tz-aware datetimes did not show `first` and `last` result (:issue:`21328`)
 - Bug in :class:`DatetimeIndex` comparisons failing to raise ``TypeError`` when comparing timezone-aware ``DatetimeIndex`` against ``np.datetime64`` (:issue:`22074`)
+- Bug in ``DataFrame`` assignment with a timezone-aware scalar (:issue:`19843`)
 
 Offsets
 ^^^^^^^
@@ -641,8 +643,8 @@ I/O
 Plotting
 ^^^^^^^^
 
-- Bug in :func:'DataFrame.plot.scatter' and :func:'DataFrame.plot.hexbin' caused x-axis label and ticklabels to disappear when colorbar was on in IPython inline backend (:issue:`10611`, :issue:`10678`, and :issue:`20455`)
--
+- Bug in :func:`DataFrame.plot.scatter` and :func:`DataFrame.plot.hexbin` caused x-axis label and ticklabels to disappear when colorbar was on in IPython inline backend (:issue:`10611`, :issue:`10678`, and :issue:`20455`)
+- Bug in plotting a Series with datetimes using :func:`matplotlib.axes.Axes.scatter` (:issue:`22039`)
 
 Groupby/Resample/Rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -670,6 +672,8 @@ Reshaping
 - Bug in :meth:`Series.where` and :meth:`DataFrame.where` with ``datetime64[ns, tz]`` dtype (:issue:`21546`)
 - Bug in :meth:`Series.mask` and :meth:`DataFrame.mask` with ``list`` conditionals (:issue:`21891`)
 - Bug in :meth:`DataFrame.replace` raises RecursionError when converting OutOfBounds ``datetime64[ns, tz]`` (:issue:`20380`)
+- :func:`pandas.core.groupby.GroupBy.rank` now raises a ``ValueError`` when an invalid value is passed for argument ``na_option`` (:issue:`22124`)
+- Bug in :func:`get_dummies` with Unicode attributes in Python 2 (:issue:`22084`)
 -
 
 Build Changes
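
Note on the ``compression='infer'`` entry in this file: the release note says the compression method is inferred from the filename extension. The sketch below illustrates that idea with the standard library only. It is not pandas' implementation; `EXTENSION_TO_COMPRESSION`, `infer_compression_from_name`, and `write_text` are hypothetical names, and the extension table is an assumption based on the `compression` options listed in the io.rst hunk (``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``).

```python
import gzip

# Assumed extension table; the names in this block are illustrative,
# not pandas APIs.
EXTENSION_TO_COMPRESSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".zip": "zip",
    ".xz": "xz",
}


def infer_compression_from_name(path):
    """Return the method implied by the file extension, or None."""
    for extension, method in EXTENSION_TO_COMPRESSION.items():
        if path.endswith(extension):
            return method
    return None


def write_text(path, text, compression="infer"):
    """Write text to path, resolving 'infer' from the extension first."""
    if compression == "infer":
        compression = infer_compression_from_name(path)
    if compression == "gzip":
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write(text)
    elif compression is None:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    else:
        # Kept short on purpose: the sketch covers gzip and plain output only.
        raise NotImplementedError("sketch handles only gzip and uncompressed")
```

With ``'infer'`` as the default, a caller who names the output `out.csv.gz` gets a gzip file without passing `compression` at all, which is the behavior change the entry describes for ``to_csv``, ``to_json``, and ``to_pickle``.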

pandas/_libs/hashing.pyx

Lines changed: 4 additions & 4 deletions
@@ -5,7 +5,7 @@
 import cython
 
 import numpy as np
-from numpy cimport ndarray, uint8_t, uint32_t, uint64_t
+from numpy cimport uint8_t, uint32_t, uint64_t
 
 from util cimport _checknull
 from cpython cimport (PyBytes_Check,
@@ -17,7 +17,7 @@ DEF dROUNDS = 4
 
 
 @cython.boundscheck(False)
-def hash_object_array(ndarray[object] arr, object key, object encoding='utf8'):
+def hash_object_array(object[:] arr, object key, object encoding='utf8'):
     """
     Parameters
     ----------
@@ -37,7 +37,7 @@ def hash_object_array(ndarray[object] arr, object key, object encoding='utf8'):
     """
     cdef:
         Py_ssize_t i, l, n
-        ndarray[uint64_t] result
+        uint64_t[:] result
         bytes data, k
         uint8_t *kb
         uint64_t *lens
@@ -89,7 +89,7 @@ def hash_object_array(ndarray[object] arr, object key, object encoding='utf8'):
 
     free(vecs)
     free(lens)
-    return result
+    return result.base # .base to retrieve underlying np.ndarray
 
 
 cdef inline uint64_t _rotl(uint64_t x, uint64_t b) nogil:
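
Note on the `return result.base` hunk: once `result` is declared as a typed memoryview (`uint64_t[:]`) rather than an `ndarray`, returning it directly would hand the caller a view object, so the diff returns the object the view wraps. Cython code needs a build step, so the sketch below is only an analogy using Python's built-in `memoryview`, whose `.obj` attribute plays a comparable role. It is not the Cython API, and `fill_and_return_backing` is a hypothetical name.

```python
from array import array


def fill_and_return_backing(n):
    """Fill a buffer through a view, then return the object behind the view."""
    # array("Q") holds unsigned 64-bit integers, standing in for the
    # uint64 result buffer in the diff.
    backing = array("Q", [0] * n)
    view = memoryview(backing)
    for i in range(n):
        view[i] = i * 10  # writes go through the view into `backing`
    # Return the underlying container, not the view itself.
    return view.obj


result = fill_and_return_backing(3)
print(type(result).__name__, list(result))
```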

pandas/_libs/src/ujson/python/objToJSON.c

Lines changed: 2 additions & 2 deletions
@@ -47,8 +47,8 @@ Numeric decoder derived from from TCL library
 #include <numpy/npy_math.h> // NOLINT(build/include_order)
 #include <stdio.h> // NOLINT(build/include_order)
 #include <ultrajson.h> // NOLINT(build/include_order)
-#include <np_datetime.h> // NOLINT(build/include_order)
-#include <np_datetime_strings.h> // NOLINT(build/include_order)
+#include <../../../tslibs/src/datetime/np_datetime.h> // NOLINT(build/include_order)
+#include <../../../tslibs/src/datetime/np_datetime_strings.h> // NOLINT(build/include_order)
 #include "datetime.h"
 
 static PyObject *type_decimal;

pandas/_libs/tslib.pyx

Lines changed: 9 additions & 22 deletions
@@ -1,6 +1,5 @@
 # -*- coding: utf-8 -*-
 # cython: profile=False
-cimport cython
 from cython cimport Py_ssize_t
 
 from cpython cimport PyFloat_Check, PyUnicode_Check
@@ -37,8 +36,7 @@ from tslibs.np_datetime import OutOfBoundsDatetime
 from tslibs.parsing import parse_datetime_string
 
 from tslibs.timedeltas cimport cast_from_unit
-from tslibs.timezones cimport (is_utc, is_tzlocal, is_fixed_offset,
-                               treat_tz_as_pytz, get_dst_info)
+from tslibs.timezones cimport is_utc, is_tzlocal, get_dst_info
 from tslibs.conversion cimport (tz_convert_single, _TSObject,
                                 convert_datetime_to_tsobject,
                                 get_datetime64_nanos,
@@ -77,8 +75,7 @@ cdef inline object create_time_from_ts(
     return time(dts.hour, dts.min, dts.sec, dts.us)
 
 
-def ints_to_pydatetime(ndarray[int64_t] arr, tz=None, freq=None,
-                       box="datetime"):
+def ints_to_pydatetime(int64_t[:] arr, tz=None, freq=None, box="datetime"):
     """
     Convert an i8 repr to an ndarray of datetimes, date, time or Timestamp
 
@@ -102,7 +99,9 @@ def ints_to_pydatetime(ndarray[int64_t] arr, tz=None, freq=None,
 
     cdef:
         Py_ssize_t i, n = len(arr)
-        ndarray[int64_t] trans, deltas
+        ndarray[int64_t] trans
+        int64_t[:] deltas
+        Py_ssize_t pos
         npy_datetimestruct dts
         object dt
         int64_t value, delta
@@ -635,24 +634,12 @@ cpdef array_to_datetime(ndarray[object] values, errors='raise',
 
                     # If the dateutil parser returned tzinfo, capture it
                     # to check if all arguments have the same tzinfo
-                    tz = py_dt.tzinfo
+                    tz = py_dt.utcoffset()
                     if tz is not None:
                         seen_datetime_offset = 1
-                        if tz == dateutil_utc():
-                            # dateutil.tz.tzutc has no offset-like attribute
-                            # Just add the 0 offset explicitly
-                            out_tzoffset_vals.add(0)
-                        elif tz == tzlocal():
-                            # is comparison fails unlike other dateutil.tz
-                            # objects. Also, dateutil.tz.tzlocal has no
-                            # _offset attribute like tzoffset
-                            offset_seconds = tz._dst_offset.total_seconds()
-                            out_tzoffset_vals.add(offset_seconds)
-                        else:
-                            # dateutil.tz.tzoffset objects cannot be hashed
-                            # store the total_seconds() instead
-                            offset_seconds = tz._offset.total_seconds()
-                            out_tzoffset_vals.add(offset_seconds)
+                        # dateutil timezone objects cannot be hashed, so store
+                        # the UTC offsets in seconds instead
+                        out_tzoffset_vals.add(tz.total_seconds())
                     else:
                         # Add a marker for naive string, to track if we are
                         # parsing mixed naive and aware strings
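
Note on the last hunk: the removed branches reached into private, type-specific attributes (`_offset`, `_dst_offset`) to get an offset, while the replacement calls `datetime.utcoffset()`, which returns a `timedelta` for aware datetimes and `None` for naive ones regardless of the tzinfo class. The sketch below shows that pattern with standard-library `timezone` objects standing in for dateutil ones (an assumption made so the example runs without dateutil). `collect_offsets` is a hypothetical name, not a pandas function.

```python
from datetime import datetime, timedelta, timezone


def collect_offsets(datetimes):
    """Return (set of UTC offsets in seconds, whether any value was naive)."""
    offsets = set()
    saw_naive = False
    for dt in datetimes:
        offset = dt.utcoffset()  # timedelta for aware values, None for naive
        if offset is None:
            saw_naive = True
        else:
            # Store plain seconds so set membership never depends on
            # how a particular tzinfo class handles hashing.
            offsets.add(offset.total_seconds())
    return offsets, saw_naive


parsed = [
    datetime(2018, 8, 3, 12, tzinfo=timezone.utc),
    datetime(2018, 8, 3, 12, tzinfo=timezone(timedelta(hours=-5))),
    datetime(2018, 8, 3, 12),  # naive: no tzinfo attached
]
offsets, saw_naive = collect_offsets(parsed)
print(sorted(offsets), saw_naive)
```

A set with more than one offset, or a mix of offsets and naive values, is the signal the surrounding code uses to detect inconsistently zoned input.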
