Description
Versions:
google-cloud-bigquery==1.19.0
We were initially using 1.18.0 but I noticed #9044 was included in 1.19.0 so we tried that as well, but it made no difference.
We're using a pandas DataFrame read from parquet; example data would be:

   id status              created_at      execution_date
0   1    NEW 2018-09-12 09:24:31.291 2019-05-10 17:40:00
1   2    NEW 2018-09-12 09:26:45.890 2019-05-10 17:40:00

[628 rows x 4 columns]
df.dtypes
shows:
id object
status object
created_at datetime64[ns]
dataplatform_execution_date datetime64[ns]
dtype: object
When trying to load this into BigQuery using load_table_from_dataframe() and setting the job config's time_partitioning to bigquery.table.TimePartitioning(field="execution_date"), we get the following error:

The field specified for time partitioning can only be of type TIMESTAMP or DATE. The type found is: INTEGER.

This doesn't really make sense, since the field is clearly a datetime64.
The job config shown in the console looks correct (i.e. it's set to partition by day and it's using the correct field).
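For reference, a minimal sketch of the failing setup (column values are taken from the example above; the table path is illustrative, and the actual load call requires GCP credentials, so it is shown commented out):

```python
import pandas as pd

# Reconstruct a DataFrame matching the example data above.
df = pd.DataFrame(
    {
        "id": ["1", "2"],
        "status": ["NEW", "NEW"],
        "created_at": pd.to_datetime(
            ["2018-09-12 09:24:31.291", "2018-09-12 09:26:45.890"]
        ),
        "execution_date": pd.to_datetime(
            ["2019-05-10 17:40:00", "2019-05-10 17:40:00"]
        ),
    }
)

# The partitioning column really is datetime64[ns], as df.dtypes shows.
assert str(df["execution_date"].dtype) == "datetime64[ns]"

# from google.cloud import bigquery
# client = bigquery.Client()
# job_config = bigquery.LoadJobConfig()
# job_config.time_partitioning = bigquery.table.TimePartitioning(
#     field="execution_date"
# )
# client.load_table_from_dataframe(
#     df, "my_dataset.my_table", job_config=job_config
# ).result()
# -> 400: The field specified for time partitioning can only be of type
#    TIMESTAMP or DATE. The type found is: INTEGER.
```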
edit: It seems the cause for this is that DataFrame columns of type datetime64 are being converted to type INTEGER instead of DATE (or TIMESTAMP? I'm not sure which one would be the correct type in BigQuery).
edit2: Could it be this mapping is wrong and it should be DATE instead of DATETIME?

"datetime64[ns]": "DATETIME",