Google BigQuery (Enterprise)

Supported pipeline types:
  • Data Collector

The Google BigQuery (Enterprise) destination loads new data or change data capture (CDC) data to Google BigQuery. The destination can compensate for data drift to support loading to new or existing datasets, tables, and columns. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.
Note: When you do not need to process CDC data or handle data drift, you can use the simpler Google BigQuery destination that writes to existing datasets and tables only.

To load data, the destination first stages the pipeline data in CSV files in a staging area in Google Cloud Storage. Then, the destination uses the BigQuery API to run a batch load job that copies the staged files into BigQuery.
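
The two-step flow above can be sketched in Python. This is a minimal illustration, not the destination's actual implementation: the helper names render_staged_csv and load_staged_csv are hypothetical, the upload of the file to Cloud Storage is omitted, and the load step assumes the google-cloud-bigquery client library is installed and credentials are configured.

```python
import csv
import io


def render_staged_csv(records, columns):
    """Render records as CSV text with a header row, in a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Fields missing from a record are written as empty strings.
        writer.writerow({name: record.get(name, "") for name in columns})
    return buffer.getvalue()


def load_staged_csv(gcs_uri, table_id):
    """Run a batch load job that copies a staged CSV file into a table."""
    # Imported lazily so the CSV helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # skip the header row written by render_staged_csv
    )
    job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    job.result()  # block until the load job finishes


csv_text = render_staged_csv(
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}],
    ["id", "name"],
)
```

Only render_staged_csv runs in this example. Calling load_staged_csv requires a real staged file URI and table ID.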

When you configure the destination, you specify authentication information for Google BigQuery and for your Google Cloud Storage staging area. You can also use a connection to define the information required to connect to BigQuery and to the staging area. You can optionally configure the destination to connect to Google BigQuery through a proxy server.

You specify the name of the dataset and tables to load the data to. The destination loads data from record fields to table columns based on matching names.
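
As a rough illustration of name-based mapping, the following Python sketch splits a record into fields that match table columns and fields that do not. The function name map_fields_to_columns is hypothetical, and the exact, case-sensitive comparison is an assumption. The destination's actual matching rules may differ.

```python
def map_fields_to_columns(record, table_columns):
    """Split a record into fields that match a column name and fields that do not."""
    # Assumption: names are compared exactly, including case.
    matched = {name: value for name, value in record.items() if name in table_columns}
    unmatched = sorted(name for name in record if name not in table_columns)
    return matched, unmatched


matched, unmatched = map_fields_to_columns(
    {"id": 7, "name": "Ada", "nickname": "AL"},
    {"id", "name", "created_at"},
)
```

Here the id and name fields map to columns of the same name, while nickname has no matching column.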

You can configure the destination to compensate for data drift by creating new columns in existing tables when new fields appear in records or by creating new tables or datasets as needed. When creating new tables, you can configure how the destination partitions the tables.
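
Conceptually, handling column drift means comparing record fields with the existing table columns and adding whatever is missing. The Python sketch below generates BigQuery ALTER TABLE statements for new fields. The function names and the mapping from Python values to BigQuery types are assumptions made for illustration, not the destination's documented behavior.

```python
def infer_bigquery_type(value):
    """Pick a BigQuery column type for a Python value (illustrative mapping only)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOL"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "FLOAT64"
    return "STRING"


def drift_statements(table, existing_columns, record):
    """Return one ALTER TABLE statement per record field with no matching column."""
    statements = []
    for name, value in record.items():
        if name not in existing_columns:
            column_type = infer_bigquery_type(value)
            statements.append(
                f"ALTER TABLE `{table}` ADD COLUMN IF NOT EXISTS `{name}` {column_type}"
            )
    return statements


statements = drift_statements(
    "sales.orders",
    {"id"},
    {"id": 1, "coupon": "X", "paid": True},
)
```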

You can configure the root field for the row, and any first-level fields that you want to exclude from the record. You can specify characters to represent null values.
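
The effect of these three settings can be sketched in Python. The example is a simplification: build_row is a hypothetical helper, the root field is treated as a plain dictionary key rather than a field path, and "NULL" is just an example of a null representation.

```python
def build_row(record, root_field, excluded_fields, null_value):
    """Build the row to write from the root field, minus excluded first-level fields."""
    # An empty root_field means the whole record is used as the row.
    row_source = record[root_field] if root_field else record
    row = {}
    for name, value in row_source.items():
        if name in excluded_fields:
            continue
        # Missing values are written using the configured null representation.
        row[name] = null_value if value is None else value
    return row


row = build_row(
    {"meta": {"source": "app"}, "data": {"id": 1, "note": None, "tmp": "x"}},
    root_field="data",
    excluded_fields={"tmp"},
    null_value="NULL",
)
```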

When processing CDC data, you can specify the key columns in the BigQuery table that the destination uses to evaluate the merge condition.
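
To show how key columns feed a merge condition, the Python sketch below builds a BigQuery MERGE statement whose ON clause compares the key columns of the target table and a staging table. The function name build_merge_sql, the table names, and the statement shape are assumptions. The SQL that the destination actually runs is not documented here, and the sketch covers only inserts and updates, not deletes.

```python
def build_merge_sql(target, staging, key_columns, columns):
    """Build an upsert-style MERGE statement keyed on key_columns."""
    # The merge condition matches rows on every key column.
    on_clause = " AND ".join(f"T.`{c}` = S.`{c}`" for c in key_columns)
    non_key = [c for c in columns if c not in key_columns]
    column_list = ", ".join(f"`{c}`" for c in columns)
    value_list = ", ".join(f"S.`{c}`" for c in columns)

    parts = [f"MERGE `{target}` T", f"USING `{staging}` S", f"ON {on_clause}"]
    if non_key:
        # Only non-key columns are updated when a matching row exists.
        set_clause = ", ".join(f"`{c}` = S.`{c}`" for c in non_key)
        parts.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")
    parts.append(
        f"WHEN NOT MATCHED THEN INSERT ({column_list}) VALUES ({value_list})"
    )
    return "\n".join(parts)


sql = build_merge_sql("ds.orders", "ds.orders_stage", ["id"], ["id", "status"])
```

With a composite key, each key column adds one equality joined by AND to the merge condition.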

You can configure the destination to replace missing fields or fields with invalid data types with user-defined default values. You can also configure the destination to replace newline characters and trim leading and trailing spaces.
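
The cleanup options can be illustrated with a short Python sketch. The helper names, the (type, default) column specification, and the use of a single space as the newline replacement are all assumptions chosen for the example.

```python
def coerce(value, expected_type, default):
    """Return value converted to expected_type, or default if missing or invalid."""
    if value is None:
        return default
    try:
        return expected_type(value)
    except (TypeError, ValueError):
        return default


def clean_text(text, newline_replacement=" ", trim=True):
    """Replace newline characters and optionally trim surrounding spaces."""
    # Handle Windows line endings first so they become a single replacement.
    for newline in ("\r\n", "\n", "\r"):
        text = text.replace(newline, newline_replacement)
    return text.strip() if trim else text


def clean_record(record, column_specs, newline_replacement=" ", trim=True):
    """Apply defaults and text cleanup for every column in column_specs."""
    cleaned = {}
    for name, (expected_type, default) in column_specs.items():
        value = coerce(record.get(name), expected_type, default)
        if isinstance(value, str):
            value = clean_text(value, newline_replacement, trim)
        cleaned[name] = value
    return cleaned


cleaned = clean_record(
    {"id": "abc", "name": "  Ada\nLovelace  "},
    {"id": (int, 0), "name": (str, ""), "city": (str, "unknown")},
)
```

In this example, id holds an invalid integer and city is missing, so both fall back to their defaults, while name has its newline replaced and its surrounding spaces trimmed.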

Before you use the Google BigQuery (Enterprise) destination, you must install the Google stage library and complete other prerequisite tasks. The Google stage library is an Enterprise stage library. Releases of Enterprise stage libraries occur separately from Data Collector releases. For more information, see Enterprise Stage Libraries in the Data Collector documentation.