chore: sync latest changes from internal repo (#7)
docs: highlight bigframes is open-source
docs: correct the return types of Dataframe and Series
docs: create subfolders for notebooks
feat: add `bigframes.get_global_session()` and `bigframes.reset_session()` aliases
chore: mark ml.llm tests flaky
chore: make kokoro/build.sh executable
feat: add `Series.str` methods `isalpha`, `isdigit`, `isdecimal`, `isalnum`, `isspace`, `islower`, `isupper`, `zfill`, `center`
chore: pin max pytest-retry plugin version in tests
docs: sample ML Drug Name Generation notebook
docs: add samples and best practices to `read_gbq` docs
chore: fix Python download path in docs-presubmit tests
perf: add local cache for `__repr_*__` methods
feat: support `DataFrame.pivot`
fix: don't use query cache for Session construction
feat: add `bigframes.pandas.read_pickle` function
feat: support MultiIndex for DataFrame columns
chore: change the docs kokoro setup to Gerrit path
docs: transform remote function user guide into sample code
fix: raise exception for invalid function in `read_gbq_function`
docs: add release status to table of contents
feat: add `fit_transform` to `bigquery.ml` transformers
feat: use `pandas.Index` for column labels
docs: add ML section under Overview
fix: check that types are specified in `read_gbq_function`
fix: add error message to `set_index`
and the `bigframes.ml.compose module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.compose>`_.
BigQuery DataFrames offers the following transformations:

* Use the `OneHotEncoder class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.OneHotEncoder>`_
  in the ``bigframes.ml.preprocessing`` module to transform categorical values into numeric format.
* Use the `StandardScaler class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.StandardScaler>`_
  in the ``bigframes.ml.preprocessing`` module to standardize features by removing the mean and scaling to unit variance.
* Use the `ColumnTransformer class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.compose.ColumnTransformer>`_
  in the ``bigframes.ml.compose`` module to apply transformers to DataFrame columns.

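To make the one-hot encoding and standardization transformations concrete, here is a minimal plain-Python sketch of what they compute. It is an illustration only, not the BigQuery DataFrames implementation: the helper names ``one_hot`` and ``standardize`` are hypothetical, the input values are made up, and the sketch assumes the population standard deviation, which the library may or may not use.

```python
from statistics import fmean, pstdev


def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector (hypothetical helper)."""
    categories = sorted(set(values))  # fixed, reproducible column order
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return categories, rows


def standardize(values):
    """Remove the mean and scale to unit variance (hypothetical helper)."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation: an assumption of this sketch
    return [(value - mean) / std for value in values]


# Made-up sample data for illustration.
categories, encoded = one_hot(["red", "green", "red", "blue"])
scaled = standardize([2.0, 4.0, 6.0])
```

After standardization, ``scaled`` has a mean of zero and a standard deviation of one, which is what "removing the mean and scaling to unit variance" means in practice.
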
Train models
^^^^^^^^^^^^

Create estimators to train models in BigQuery DataFrames.

**Clustering models**

Create estimators for clustering models by using the ``bigframes.ml.cluster`` module.

* Use the `KMeans class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.cluster.KMeans>`_
  to create K-means clustering models. Use these models for
  data segmentation, for example, identifying customer segments. K-means is an
  unsupervised learning technique, so model training doesn't require labels or
  splitting data into training and evaluation sets.

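The point that K-means needs no labels can be seen in a small plain-Python sketch of the algorithm on one-dimensional data. This is not the ``KMeans`` class or its implementation: the function name ``kmeans_1d``, the evenly spaced starting centroids, and the spend values are all assumptions made for the illustration.

```python
def kmeans_1d(points, k, iterations=10):
    """Tiny 1-D K-means (hypothetical helper); requires k >= 2."""
    lo, hi = min(points), max(points)
    # Deterministic start: k centroids spread evenly over the data range.
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: abs(point - centroids[j])) for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = sum(members) / len(members)
    return centroids, assignments


# Made-up customer spend values; note that no labels are supplied.
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
centroids, segments = kmeans_1d(spend, k=2)
```

The only input is the unlabeled data and the number of clusters; the segments emerge from the alternating assignment and update steps.
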
**Decomposition models**

Create estimators for decomposition models by using the `bigframes.ml.decomposition module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.decomposition>`_.

* Use the `PCA class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.decomposition.PCA>`_
  to create principal component analysis (PCA) models. Use these
  models for computing principal components and using them to perform a change of
  basis on the data. This provides dimensionality reduction by projecting each data
  point onto only the first few principal components to obtain lower-dimensional
  data while preserving as much of the data's variation as possible.

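The change of basis described for PCA can be sketched for two-dimensional data in plain Python. This is a worked illustration under stated assumptions, not the ``PCA`` class: the function name ``first_principal_component`` is hypothetical, the data is made up, and the closed-form angle used here only applies to the 2x2 covariance case.

```python
import math


def first_principal_component(points):
    """Project 2-D points onto their first principal component (hypothetical helper)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix (population form).
    var_x = sum(x * x for x, _ in centered) / n
    var_y = sum(y * y for _, y in centered) / n
    cov_xy = sum(x * y for x, y in centered) / n
    # For a symmetric 2x2 matrix, the leading eigenvector lies along this angle.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(theta), math.sin(theta))
    # Change of basis: keep only the coordinate along the leading direction.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected


# Made-up points lying exactly on the line y = 2x.
direction, projected = first_principal_component([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

Because these points lie on a single line, the one-dimensional projection keeps all of the variation in the data; with noisier data it would keep as much as a single direction can.
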
**Ensemble models**

Create estimators for ensemble models by using the `bigframes.ml.ensemble module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.ensemble>`_.

* Use the `RandomForestClassifier class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.ensemble.RandomForestClassifier>`_
  to create random forest classifier models. Use these models to construct multiple
  decision trees and combine them for classification.
* Use the `RandomForestRegressor class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.ensemble.RandomForestRegressor>`_
  to create random forest regression models. Use
  these models to construct multiple decision trees and combine them for regression.
* Use the `XGBClassifier class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.ensemble.XGBClassifier>`_
  to create gradient boosted tree classifier models. Use these models to additively
  construct multiple decision trees for classification.
* Use the `XGBRegressor class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.ensemble.XGBRegressor>`_
  to create gradient boosted tree regression models. Use these models to additively
  construct multiple decision trees for regression.

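What "additively constructing" decision trees means for the gradient boosted models can be sketched in plain Python with one-split trees (stumps) on one-dimensional data. This is a simplified illustration, not XGBoost and not the BigQuery DataFrames implementation: the helper names ``fit_stump``, ``boost``, and ``mse``, the learning rate, the round count, and the data are all assumptions.

```python
def fit_stump(xs, ys):
    """Best single-split regression tree on 1-D data by squared error (hypothetical helper)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best

    def stump(x):
        return left_mean if x <= threshold else right_mean

    return stump


def boost(xs, ys, rounds=50, learning_rate=0.5):
    """Additive construction: each new stump is fit to the current residuals."""
    stumps = []

    def predict(x):
        return sum(learning_rate * stump(x) for stump in stumps)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training data.
xs, ys = [1, 2, 3, 4], [1.0, 2.0, 3.0, 4.0]
stump_mse = mse(fit_stump(xs, ys), xs, ys)
boosted_mse = mse(boost(xs, ys), xs, ys)
```

Each round adds a tree that corrects what the previous trees got wrong, so the boosted training error ends up below that of a single stump. A random forest, by contrast, trains its trees independently on resampled data and combines their predictions.
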
**Forecasting models**

Create estimators for forecasting models by using the `bigframes.ml.forecasting module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.forecasting>`_.

* Use the `ARIMAPlus class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.forecasting.ARIMAPlus>`_
  to create time series forecasting models.

**Imported models**

Create estimators for imported models by using the `bigframes.ml.imported module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.imported>`_.

* Use the `ONNXModel class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.imported.ONNXModel>`_
  to import Open Neural Network Exchange (ONNX) models.
* Use the `TensorFlowModel class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.imported.TensorFlowModel>`_
  to import TensorFlow models.

**Linear models**

Create estimators for linear models by using the `bigframes.ml.linear_model module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.linear_model>`_.

* Use the `LinearRegression class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.linear_model.LinearRegression>`_
  to create linear regression models. Use these models for forecasting, for example,
  forecasting the sales of an item on a given day.
* Use the `LogisticRegression class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.linear_model.LogisticRegression>`_
  to create logistic regression models. Use these models to classify inputs into two
  or more possible values, such as whether an input is ``low-value``, ``medium-value``,
  or ``high-value``.

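The sales forecasting use case for linear regression can be worked through with an ordinary least-squares fit in plain Python. This is a sketch of the underlying arithmetic only, not the ``LinearRegression`` class: the function name ``fit_line`` is hypothetical and the sales figures are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up daily sales of one item.
days = [1, 2, 3, 4]
sales = [12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(days, sales)
forecast_day_5 = slope * 5 + intercept  # extrapolate the fitted line to day 5
```

The fitted slope is the estimated change in sales per day, and the forecast is the fitted line evaluated at a day outside the training data.
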
**Large language models**

Create estimators for LLMs by using the `bigframes.ml.llm module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm>`_.

* Use the `PaLM2TextGenerator class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.PaLM2TextGenerator>`_ to create PaLM2 text generator models. Use these models
  for text generation tasks.
* Use the `PaLM2TextEmbeddingGenerator class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.PaLM2TextEmbeddingGenerator>`_ to create PaLM2 text embedding generator models.
  Use these models for text embedding generation tasks.

Pipelines let you assemble several ML steps to be cross-validated together while setting
different parameters. This simplifies your code and lets you deploy data preprocessing
steps and an estimator together.

* Use the `Pipeline class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.pipeline.Pipeline>`_
  to create a pipeline of transforms with a final estimator.
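
The idea of a pipeline of transforms with a final estimator can be sketched in plain Python. This is a conceptual illustration, not the ``Pipeline`` class or its API: ``Scale``, ``MeanThreshold``, and ``SimplePipeline`` are hypothetical names, and the data is made up.

```python
class Scale:
    """Hypothetical transform: rescale values to the [0, 1] range seen during fit."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        return [(v - self.lo) / (self.hi - self.lo) for v in values]


class MeanThreshold:
    """Hypothetical final estimator: predict 1 above the training mean, else 0."""

    def fit(self, values):
        self.cut = sum(values) / len(values)
        return self

    def predict(self, values):
        return [1 if v > self.cut else 0 for v in values]


class SimplePipeline:
    """Chain transforms, then hand the result to a final estimator."""

    def __init__(self, transforms, estimator):
        self.transforms, self.estimator = transforms, estimator

    def fit(self, values):
        for transform in self.transforms:
            values = transform.fit(values).transform(values)
        self.estimator.fit(values)
        return self

    def predict(self, values):
        # Reuse the parameters learned during fit; do not refit on new data.
        for transform in self.transforms:
            values = transform.transform(values)
        return self.estimator.predict(values)


pipeline = SimplePipeline([Scale()], MeanThreshold()).fit([0.0, 10.0, 20.0, 30.0])
predictions = pipeline.predict([5.0, 25.0])
```

Because the preprocessing parameters are learned once in ``fit`` and reused in ``predict``, the preprocessing steps and the estimator travel together as one object, which is the deployment benefit described above.
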