使用 TimesFM 单变量模型预测多个时序


本教程介绍了如何将 AI.FORECAST 函数与 BigQuery ML 的内置 TimesFM 单变量模型搭配使用,根据给定列的历史值来预测该列的未来值。

本教程使用来自公开的 bigquery-public-data.san_francisco_bikeshare.bikeshare_trips 表中的数据。

目标

本教程将引导您将 AI.FORECAST 函数与内置的 TimesFM 模型结合使用,以预测共享单车行程。前两个部分介绍了如何预测单个时间序列的结果并直观呈现结果。第三部分介绍了如何预测多个时间序列。

费用

本教程使用 Google Cloud的可计费组件,包括以下组件:

  • BigQuery
  • BigQuery ML

如需详细了解 BigQuery 费用,请参阅 BigQuery 价格页面。

如需详细了解 BigQuery ML 费用,请参阅 BigQuery ML 价格

准备工作

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  5. Make sure that billing is enabled for your Google Cloud project.

  6. 新项目会自动启用 BigQuery。如需在现有项目中启用 BigQuery,请前往

    Enable the BigQuery API.

    Enable the API

预测单个共享单车行程时间序列

使用 AI.FORECAST 函数预测未来的时序值。

以下查询根据过去四个月的历史数据,预测下个月(约 720 小时)每小时的订阅者自行车共享行程数。confidence_level 参数指示查询会生成置信度为 95% 的预测区间。

请按照以下步骤使用 TimesFM 模型预测数据:

  1. 在 Google Cloud 控制台中,前往 BigQuery 页面。

    转到 BigQuery

  2. 在查询编辑器中,粘贴以下查询,然后点击运行

    SELECT *
    FROM
      AI.FORECAST(
        (
          SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips
    FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
    WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01')
    GROUP BY TIMESTAMP_TRUNC(start_date, HOUR)
        ),
        horizon => 720,
        confidence_level => 0.95,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips');

    结果类似于以下内容:

    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | forecast_timestamp      | forecast_value    | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 00:00:00 UTC | 26.3045959...     |            0.95  | 21.7088378...                   | 30.9003540...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 01:00:00 UTC | 34.0890502...     |            0.95  | 2.47682913...                   | 65.7012714...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 02:00:00 UTC | 24.2154693...     |            0.95  | 2.87621605...                   | 45.5547226...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | ...                     | ...               |  ...             | ...                             |  ...                            |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    

将预测数据与输入数据进行比较

AI.FORECAST 函数输出与函数输入数据的一部分绘制到图表中,以查看它们之间的对比情况。

请按照以下步骤绘制函数输出图表:

  1. 在 Google Cloud 控制台中,前往 BigQuery 页面。

    转到 BigQuery

  2. 在查询编辑器中,粘贴以下查询,然后点击运行

    WITH historical AS (
    SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips
    FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
    WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01')
    GROUP BY TIMESTAMP_TRUNC(start_date, HOUR)
    ORDER BY TIMESTAMP_TRUNC(start_date, HOUR)
    )
    SELECT * 
    FROM 
    (
    (SELECT
        trip_hour as date,
        num_trips AS historical_value,
        NULL as forecast_value,
        'historical' as type,
        NULL as prediction_interval_low,
        NULL as prediction_interval_upper_bound
    FROM
        historical
    ORDER BY historical.trip_hour DESC
    LIMIT 400)
    UNION ALL
    (SELECT forecast_timestamp AS date,
            NULL as historical_value,
            forecast_value as forecast_value, 
            'forecast' as type, 
            prediction_interval_lower_bound, 
            prediction_interval_upper_bound
    FROM
        AI.FORECAST(
        (
        SELECT * FROM historical
        ),
        horizon => 720,
        confidence_level => 0.99,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips')))
    ORDER BY date asc;
  3. 查询运行完毕后,点击查询结果窗格中的图表标签页。生成的图表如下所示:

    将 100 个时间点的输入数据与 AI.FORECAST 函数输出数据绘制到图表中,以评估它们的相似性。

    您可以看到,输入数据和预测数据显示了类似的共享单车使用情况。您还可以看到,随着预测时间点向未来延伸,预测区间的下限和上限会增大。

预测多个共享单车行程时序

以下查询根据过去四个月的历史数据,预测下个月(约 720 小时)每种订阅者类型的每小时骑行共享行程数。confidence_level 参数指示查询会生成置信度为 95% 的预测区间。

请按照以下步骤使用 TimesFM 模型预测数据:

  1. 在 Google Cloud 控制台中,前往 BigQuery 页面。

    转到 BigQuery

  2. 在查询编辑器中,粘贴以下查询,然后点击运行

    SELECT *
    FROM
      AI.FORECAST(
        (
          SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, subscriber_type, COUNT(*) as num_trips
          FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
          WHERE start_date >= TIMESTAMP('2018-01-01')
          GROUP BY TIMESTAMP_TRUNC(start_date, HOUR), subscriber_type
        ),
        horizon => 720,
        confidence_level => 0.95,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips',
        id_cols => ['subscriber_type']);

    结果类似于以下内容:

    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | subscriber_type     | forecast_timestamp       | forecast_value   | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          | 2018-05-01 00:00:00 UTC  | 26.3045959...    |            0.95  | 21.7088378...                   | 30.9003540...                   |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          |  2018-05-01 01:00:00 UTC | 34.0890502...    |            0.95  | 2.47682913...                   | 65.7012714...                   |                    |
    +---------------------+-------------------+------------------+-------------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          |  2018-05-01 02:00:00 UTC | 24.2154693...    |            0.95  | 2.87621605...                   | 45.5547226...                   |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | ...                 | ...                      |  ...             | ...              | ...                             |  ...                            |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    

清理

为避免因本教程中使用的资源导致您的 Google Cloud 账号产生费用,请删除包含这些资源的项目,或者保留项目但删除各个资源。

删除项目

要删除项目,请执行以下操作:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

后续步骤