docs: document inlining of small data in read_* APIs (#670)

shobsi · web-flow · commit 306953aaae69 · 2024-05-08T15:58:29.000-07:00
* docs: document inlining of small data in `read_*` APIs

* mention that threshold is in memory size

* non-bigquery instead of non-"bigquery"
diff --git a/bigframes/session/__init__.py b/bigframes/session/__init__.py
@@ -874,6 +874,11 @@ def read_pandas(
         The pandas DataFrame will be persisted as a temporary BigQuery table, which can be
         automatically recycled after the Session is closed.
 
+        .. note::
+            Data is inlined in the query SQL if it is small enough (roughly 5MB
+            or less in memory). Larger data is loaded into a BigQuery table
+            instead.
+
         **Examples:**
 
             >>> import bigframes.pandas as bpd
diff --git a/third_party/bigframes_vendored/pandas/io/parquet.py b/third_party/bigframes_vendored/pandas/io/parquet.py
@@ -19,6 +19,11 @@ def read_parquet(
             Instead, set a serialized index column as the index and sort by
             that in the resulting DataFrame.
 
+        .. note::
+            For non-bigquery engines, data is inlined in the query SQL if it is
+            small enough (roughly 5MB or less in memory). Larger data is
+            loaded into a BigQuery table instead.
+
         **Examples:**
 
             >>> import bigframes.pandas as bpd
diff --git a/third_party/bigframes_vendored/pandas/io/parsers/readers.py b/third_party/bigframes_vendored/pandas/io/parsers/readers.py
@@ -62,6 +62,11 @@ def read_csv(
             file. Instead, set a serialized index column as the index and sort by
             that in the resulting DataFrame.
 
+        .. note::
+            For non-bigquery engines, data is inlined in the query SQL if it is
+            small enough (roughly 5MB or less in memory). Larger data is
+            loaded into a BigQuery table instead.
+
         **Examples:**
 
             >>> import bigframes.pandas as bpd
@@ -167,6 +172,11 @@ def read_json(
             file. Instead, set a serialized index column as the index and sort by
             that in the resulting DataFrame.
 
+        .. note::
+            For non-bigquery engines, data is inlined in the query SQL if it is
+            small enough (roughly 5MB or less in memory). Larger data is
+            loaded into a BigQuery table instead.
+
         **Examples:**
 
             >>> import bigframes.pandas as bpd
diff --git a/third_party/bigframes_vendored/pandas/io/pickle.py b/third_party/bigframes_vendored/pandas/io/pickle.py
@@ -25,6 +25,11 @@ def read_pickle(
             If the content of the pickle file is a Series and its name attribute is None,
             the name will be set to '0' by default.
 
+        .. note::
+            Data is inlined in the query SQL if it is small enough (roughly 5MB
+            or less in memory). Larger data is loaded into a BigQuery table
+            instead.
+
         **Examples:**
 
             >>> import bigframes.pandas as bpd