Commit 9d4be70

Merge remote-tracking branch 'upstream/master' into argsort_labelling_index_sorted
2 parents 2d92c34 + 1ff6970 commit 9d4be70

46 files changed: +1053 -435 lines changed

doc/source/reference/io.rst

Lines changed: 47 additions & 3 deletions
@@ -13,6 +13,7 @@ Pickling
    :toctree: api/

    read_pickle
+   DataFrame.to_pickle

 Flat file
 ~~~~~~~~~
@@ -21,6 +22,7 @@ Flat file

    read_table
    read_csv
+   DataFrame.to_csv
    read_fwf

 Clipboard
@@ -29,30 +31,41 @@ Clipboard
    :toctree: api/

    read_clipboard
+   DataFrame.to_clipboard

 Excel
 ~~~~~
 .. autosummary::
    :toctree: api/

    read_excel
+   DataFrame.to_excel
    ExcelFile.parse

+.. currentmodule:: pandas.io.formats.style
+
+.. autosummary::
+   :toctree: api/
+
+   Styler.to_excel
+
+.. currentmodule:: pandas
+
 .. autosummary::
    :toctree: api/
    :template: autosummary/class_without_autosummary.rst

    ExcelWriter

+.. currentmodule:: pandas.io.json
+
 JSON
 ~~~~
 .. autosummary::
    :toctree: api/

    read_json
-   json_normalize
-
-.. currentmodule:: pandas.io.json
+   to_json

 .. autosummary::
    :toctree: api/
@@ -67,13 +80,40 @@ HTML
    :toctree: api/

    read_html
+   DataFrame.to_html
+
+.. currentmodule:: pandas.io.formats.style
+
+.. autosummary::
+   :toctree: api/
+
+   Styler.to_html
+
+.. currentmodule:: pandas

 XML
 ~~~~
 .. autosummary::
    :toctree: api/

    read_xml
+   DataFrame.to_xml
+
+Latex
+~~~~~
+.. autosummary::
+   :toctree: api/
+
+   DataFrame.to_latex
+
+.. currentmodule:: pandas.io.formats.style
+
+.. autosummary::
+   :toctree: api/
+
+   Styler.to_latex
+
+.. currentmodule:: pandas

 HDFStore: PyTables (HDF5)
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -101,13 +141,15 @@ Feather
    :toctree: api/

    read_feather
+   DataFrame.to_feather

 Parquet
 ~~~~~~~
 .. autosummary::
    :toctree: api/

    read_parquet
+   DataFrame.to_parquet

 ORC
 ~~~
@@ -138,6 +180,7 @@ SQL
    read_sql_table
    read_sql_query
    read_sql
+   DataFrame.to_sql

 Google BigQuery
 ~~~~~~~~~~~~~~~
@@ -152,6 +195,7 @@ STATA
    :toctree: api/

    read_stata
+   DataFrame.to_stata

 .. currentmodule:: pandas.io.stata
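
The io.rst hunks above list each ``DataFrame.to_*`` writer next to its ``read_*`` counterpart. As a minimal sketch of that pairing, a CSV round trip through an in-memory buffer might look like the following. The sample frame and its column names are made up for illustration and do not come from the commit.

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from the commit.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Write with the DataFrame.to_csv writer into an in-memory text buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind and read the same text back with the matching read_csv reader.
buffer.seek(0)
restored = pd.read_csv(buffer)
```

The other reader/writer pairs in the reference page follow the same shape, although binary formats such as Parquet or Feather need optional dependencies and a binary buffer or file path.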

doc/source/reference/style.rst

Lines changed: 3 additions & 0 deletions
@@ -35,11 +35,14 @@ Style application
    Styler.apply
    Styler.applymap
    Styler.format
+   Styler.hide_index
+   Styler.hide_columns
    Styler.set_td_classes
    Styler.set_table_styles
    Styler.set_table_attributes
    Styler.set_tooltips
    Styler.set_caption
+   Styler.set_sticky
    Styler.set_properties
    Styler.set_uuid
    Styler.clear

doc/source/user_guide/style.ipynb

Lines changed: 26 additions & 15 deletions
@@ -152,7 +152,7 @@
 "\n",
 "Before adding styles it is useful to show that the [Styler][styler] can distinguish the *display* value from the *actual* value. To control the display value, the text is printed in each cell, and we can use the [.format()][formatfunc] method to manipulate this according to a [format spec string][format] or a callable that takes a single value and returns a string. It is possible to define this for the whole table or for individual columns. \n",
 "\n",
-"Additionally, the format function has a **precision** argument to specifically help formatting floats, an **na_rep** argument to display missing data, and an **escape** argument to help displaying safe-HTML. The default formatter is configured to adopt pandas' regular `display.precision` option, controllable using `with pd.option_context('display.precision', 2):`\n",
+"Additionally, the format function has a **precision** argument to specifically help formatting floats, as well as **decimal** and **thousands** separators to support other locales, an **na_rep** argument to display missing data, and an **escape** argument to help displaying safe-HTML or safe-LaTeX. The default formatter is configured to adopt pandas' regular `display.precision` option, controllable using `with pd.option_context('display.precision', 2):`\n",
 "\n",
 "Here is an example of using the multiple options to control the formatting generally and with specific column formatters.\n",
 "\n",
@@ -167,9 +167,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df.style.format(precision=0, na_rep='MISSING', \n",
+"df.style.format(precision=0, na_rep='MISSING', thousands=\" \",\n",
 " formatter={('Decision Tree', 'Tumour'): \"{:.2f}\",\n",
-" ('Regression', 'Non-Tumour'): lambda x: \"$ {:,.1f}\".format(x*-1e3)\n",
+" ('Regression', 'Non-Tumour'): lambda x: \"$ {:,.1f}\".format(x*-1e6)\n",
 " })"
 ]
 },
@@ -179,9 +179,11 @@
 "source": [
 "### Hiding Data\n",
 "\n",
-"The index can be hidden from rendering by calling [.hide_index()][hideidx], which might be useful if your index is integer based.\n",
+"The index and column headers can be completely hidden, as well as subselecting rows or columns that one wishes to exclude. Both these options are performed using the same methods.\n",
 "\n",
-"Columns can be hidden from rendering by calling [.hide_columns()][hidecols] and passing in the name of a column, or a slice of columns.\n",
+"The index can be hidden from rendering by calling [.hide_index()][hideidx] without any arguments, which might be useful if your index is integer based. Similarly column headers can be hidden by calling [.hide_columns()][hidecols] without any arguments.\n",
+"\n",
+"Specific rows or columns can be hidden from rendering by calling the same [.hide_index()][hideidx] or [.hide_columns()][hidecols] methods and passing in a row/column label, a list-like or a slice of row/column labels for the ``subset`` argument.\n",
 "\n",
 "Hiding does not change the integer arrangement of CSS classes, e.g. hiding the first two columns of a DataFrame means the column class indexing will start at `col2`, since `col0` and `col1` are simply ignored.\n",
 "\n",
@@ -1403,7 +1405,9 @@
 "source": [
 "### Sticky Headers\n",
 "\n",
-"If you display a large matrix or DataFrame in a notebook, but you want to always see the column and row headers you can use the following CSS to make them stick. We might make this into an API function later."
+"If you display a large matrix or DataFrame in a notebook, but you want to always see the column and row headers you can use the [.set_sticky][sticky] method which manipulates the table styles CSS.\n",
+"\n",
+"[sticky]: ../reference/api/pandas.io.formats.style.Styler.set_sticky.rst"
 ]
 },
 {
@@ -1412,20 +1416,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"bigdf = pd.DataFrame(np.random.randn(15, 100))\n",
-"bigdf.style.set_table_styles([\n",
-" {'selector': 'thead th', 'props': 'position: sticky; top:0; background-color:salmon;'},\n",
-" {'selector': 'tbody th', 'props': 'position: sticky; left:0; background-color:lightgreen;'} \n",
-"])"
+"bigdf = pd.DataFrame(np.random.randn(16, 100))\n",
+"bigdf.style.set_sticky(axis=\"index\")"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Hiding Headers\n",
-"\n",
-"We don't yet have any API to hide headers so a quick fix is:"
+"It is also possible to stick MultiIndexes and even only specific levels."
 ]
 },
 {
@@ -1434,7 +1433,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df3.style.set_table_styles([{'selector': 'thead tr', 'props': 'display: none;'}]) # or 'thead th'"
+"bigdf.index = pd.MultiIndex.from_product([[\"A\",\"B\"],[0,1],[0,1,2,3]])\n",
+"bigdf.style.set_sticky(axis=\"index\", pixel_size=18, levels=[1,2])"
 ]
 },
 {
@@ -1524,6 +1524,17 @@
 "![Excel spreadsheet with styled DataFrame](../_static/style-excel.png)\n"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Export to LaTeX\n",
+"\n",
+"There is support (*since version 1.3.0*) to export `Styler` to LaTeX. The documentation for the [.to_latex][latex] method gives further detail and numerous examples.\n",
+"\n",
+"[latex]: ../reference/api/pandas.io.formats.style.Styler.to_latex.rst"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},

doc/source/whatsnew/v1.3.0.rst

Lines changed: 10 additions & 0 deletions
@@ -138,6 +138,7 @@ which has been revised and improved (:issue:`39720`, :issue:`39317`, :issue:`404
 - Added the option ``styler.render.max_elements`` to avoid browser overload when styling large DataFrames (:issue:`40712`)
 - Added the method :meth:`.Styler.to_latex` (:issue:`21673`), which also allows some limited CSS conversion (:issue:`40731`)
 - Added the method :meth:`.Styler.to_html` (:issue:`13379`)
+- Added the method :meth:`.Styler.set_sticky` to make index and column headers permanently visible in scrolling HTML frames (:issue:`29072`)

 .. _whatsnew_130.enhancements.dataframe_honors_copy_with_dict:

@@ -986,6 +987,7 @@ Indexing
 ^^^^^^^^
 - Bug in :meth:`Index.union` and :meth:`MultiIndex.union` dropping duplicate ``Index`` values when ``Index`` was not monotonic or ``sort`` was set to ``False`` (:issue:`36289`, :issue:`31326`, :issue:`40862`)
 - Bug in :meth:`CategoricalIndex.get_indexer` failing to raise ``InvalidIndexError`` when non-unique (:issue:`38372`)
+- Bug in :meth:`IntervalIndex.get_indexer` when ``target`` has ``CategoricalDtype`` and both the index and the target contain NA values (:issue:`41934`)
 - Bug in :meth:`Series.loc` raising a ``ValueError`` when input was filtered with a Boolean list and values to set were a list with lower dimension (:issue:`20438`)
 - Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
 - Bug in :meth:`DataFrame.__setitem__` raising a ``ValueError`` when setting multiple values to duplicate columns (:issue:`15695`)
@@ -1025,6 +1027,7 @@ Indexing
 - Bug in :meth:`PeriodIndex.get_loc` failing to raise a ``KeyError`` when given a :class:`Period` with a mismatched ``freq`` (:issue:`41670`)
 - Bug ``.loc.__getitem__`` with a :class:`UInt64Index` and negative-integer keys raising ``OverflowError`` instead of ``KeyError`` in some cases, wrapping around to positive integers in others (:issue:`41777`)
 - Bug in :meth:`Index.get_indexer` failing to raise ``ValueError`` in some cases with invalid ``method``, ``limit``, or ``tolerance`` arguments (:issue:`41918`)
+- Bug when slicing a :class:`Series` or :class:`DataFrame` with a :class:`TimedeltaIndex` when passing an invalid string raising ``ValueError`` instead of a ``TypeError`` (:issue:`41821`)

 Missing
 ^^^^^^^
@@ -1033,13 +1036,15 @@ Missing
 - Bug in :meth:`DataFrame.fillna` not accepting a dictionary for the ``downcast`` keyword (:issue:`40809`)
 - Bug in :func:`isna` not returning a copy of the mask for nullable types, causing any subsequent mask modification to change the original array (:issue:`40935`)
 - Bug in :class:`DataFrame` construction with float data containing ``NaN`` and an integer ``dtype`` casting instead of retaining the ``NaN`` (:issue:`26919`)
+- Bug in :meth:`Series.isin` and :meth:`MultiIndex.isin` not treating all ``NaN`` values as equivalent when they were inside tuples (:issue:`41836`)

 MultiIndex
 ^^^^^^^^^^
 - Bug in :meth:`DataFrame.drop` raising a ``TypeError`` when the :class:`MultiIndex` is non-unique and ``level`` is not provided (:issue:`36293`)
 - Bug in :meth:`MultiIndex.intersection` duplicating ``NaN`` in the result (:issue:`38623`)
 - Bug in :meth:`MultiIndex.equals` incorrectly returning ``True`` when the :class:`MultiIndex` contained ``NaN`` even when they are differently ordered (:issue:`38439`)
 - Bug in :meth:`MultiIndex.intersection` always returning an empty result when intersecting with :class:`CategoricalIndex` (:issue:`38653`)
+- Bug in :meth:`MultiIndex.difference` incorrectly raising ``TypeError`` when indexes contain non-sortable entries (:issue:`41915`)
 - Bug in :meth:`MultiIndex.reindex` raising a ``ValueError`` when used on an empty :class:`MultiIndex` and indexing only a specific level (:issue:`41170`)
 - Bug in :meth:`MultiIndex.reindex` raising ``TypeError`` when reindexing against a flat :class:`Index` (:issue:`41707`)

@@ -1079,6 +1084,7 @@ I/O
 - Bug in the conversion from PyArrow to pandas (e.g. for reading Parquet) with nullable dtypes and a PyArrow array whose data buffer size is not a multiple of the dtype size (:issue:`40896`)
 - Bug in :func:`read_excel` would raise an error when pandas could not determine the file type even though the user specified the ``engine`` argument (:issue:`41225`)
 - Bug in :func:`read_clipboard` copying from an excel file shifts values into the wrong column if there are null values in first column (:issue:`41108`)
+- Bug in :meth:`DataFrame.to_hdf` and :meth:`Series.to_hdf` raising a ``TypeError`` when trying to append a string column to an incompatible column (:issue:`41897`)

 Period
 ^^^^^^
@@ -1138,6 +1144,8 @@ Groupby/resample/rolling
 - Bug in :class:`DataFrameGroupBy` aggregations incorrectly failing to drop columns with invalid dtypes for that aggregation when there are no valid columns (:issue:`41291`)
 - Bug in :meth:`DataFrame.rolling.__iter__` where ``on`` was not assigned to the index of the resulting objects (:issue:`40373`)
 - Bug in :meth:`.DataFrameGroupBy.transform` and :meth:`.DataFrameGroupBy.agg` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`41647`)
+- Bug in :class:`DataFrameGroupBy` methods ``agg``, ``transform``, ``sum``, ``bfill``, ``ffill``, ``pad``, ``pct_change``, ``shift``, ``ohlc`` dropping ``.columns.names`` (:issue:`41497`)
+

 Reshaping
 ^^^^^^^^^
@@ -1160,6 +1168,7 @@ Reshaping
 - Bug in :func:`to_datetime` raising an error when the input sequence contained unhashable items (:issue:`39756`)
 - Bug in :meth:`Series.explode` preserving the index when ``ignore_index`` was ``True`` and values were scalars (:issue:`40487`)
 - Bug in :func:`to_datetime` raising a ``ValueError`` when :class:`Series` contains ``None`` and ``NaT`` and has more than 50 elements (:issue:`39882`)
+- Bug in :meth:`Series.unstack` and :meth:`DataFrame.unstack` with object-dtype values containing timezone-aware datetime objects incorrectly raising ``TypeError`` (:issue:`41875`)
 - Bug in :meth:`DataFrame.melt` raising ``InvalidIndexError`` when :class:`DataFrame` has duplicate columns used as ``value_vars`` (:issue:`41951`)

 Sparse
@@ -1206,6 +1215,7 @@ Other
 - Bug in :class:`Series` backed by :class:`DatetimeArray` or :class:`TimedeltaArray` sometimes failing to set the array's ``freq`` to ``None`` (:issue:`41425`)
 - Bug in creating a :class:`Series` from a ``range`` object that does not fit in the bounds of ``int64`` dtype (:issue:`30173`)
 - Bug in creating a :class:`Series` from a ``dict`` with all-tuple keys and an :class:`Index` that requires reindexing (:issue:`41707`)
+- Bug in :func:`pandas.util.hash_pandas_object` not recognizing ``hash_key``, ``encoding`` and ``categorize`` when the input object type is a :class:`DataFrame` (:issue:`41404`)

 .. ---------------------------------------------------------------------------
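
The last "Other" entry above concerns :func:`pandas.util.hash_pandas_object` honoring ``hash_key`` for :class:`DataFrame` input. A minimal sketch of the behavior that entry describes, assuming a pandas version that includes the fix, might look like this. The frame contents and the custom key are arbitrary illustrative values.

```python
import pandas as pd

# Hypothetical frame with one numeric and one string column.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Row hashes computed with the library's default key.
default_hashes = pd.util.hash_pandas_object(df)

# The key must encode to exactly 16 bytes; this value is an arbitrary example.
custom_hashes = pd.util.hash_pandas_object(df, hash_key="fedcba9876543210")
```

With the fix in place the two results are expected to differ, because the key feeds into the hashing of the string column.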

doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 1 deletion
@@ -167,7 +167,7 @@ Missing

 MultiIndex
 ^^^^^^^^^^
--
+- Bug in :meth:`MultiIndex.reindex` when passing a ``level`` that corresponds to an ``ExtensionDtype`` level (:issue:`42043`)
 -

 I/O
