Google BigQuery

The Google BigQuery destination loads new data or change data capture (CDC) data into Google BigQuery. The destination can compensate for data drift to support loading to new or existing datasets, tables, and columns. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.

To load data, the destination first stages the data in CSV or Avro files in a staging area in Google Cloud Storage. Then, the destination creates a BigQuery batch load job that copies the staged files into BigQuery. For information about the Google BigQuery quota policy for batch loading data, see the Google Cloud BigQuery documentation.

When you configure the destination, you specify authentication information for Google BigQuery and for your Google Cloud Storage staging area. You can also use a connection to define the information required to connect to BigQuery and to the staging area. You can optionally configure the destination to connect to Google BigQuery through a proxy server.

You specify the names of the dataset and tables to load the data to. The destination loads data from record fields to table columns based on matching names.
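As a rough illustration of name-based mapping, the following Python sketch pairs record fields with table columns. The function name `map_record_to_columns` is hypothetical, and the exact, case-sensitive comparison is an assumption; the destination's actual matching rules may differ.

```python
def map_record_to_columns(record, table_columns):
    """Pair record fields with table columns that share the same name.

    Assumption: names are compared exactly (case-sensitive).
    Returns the row to load and the record fields that matched no column.
    """
    # Columns with no matching field are left as None (null).
    row = {column: record.get(column) for column in table_columns}
    # Fields with no matching column are reported separately.
    unmatched = [name for name in record if name not in table_columns]
    return row, unmatched


record = {"id": 7, "name": "Ada", "region": "EU"}
row, unmatched = map_record_to_columns(record, ["id", "name", "signup_date"])
print(row)        # fields matched to columns by name
print(unmatched)  # fields that have no column in the table
```

In this sketch, the unmatched fields are the ones that a data drift configuration would turn into new columns.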

You can configure the destination to compensate for data drift by creating new columns in existing tables when new fields appear in records or by creating new tables or datasets as needed. When creating new tables, you can configure how the destination partitions the tables.
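One way to picture column-level drift handling is to compare the fields in a record against the columns a table already has, then derive a statement for each new field. The sketch below is illustrative only: the helper names and the type inference are hypothetical, and this text does not state how the destination actually alters tables. The `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` form is standard BigQuery DDL.

```python
def infer_bq_type(value):
    """Guess a BigQuery type from a Python value (hypothetical mapping)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOL"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "FLOAT64"
    # Everything else, including None, falls back to STRING in this sketch.
    return "STRING"


def drift_statements(table, existing_columns, record):
    """Build one ADD COLUMN statement per record field missing from the table."""
    known = set(existing_columns)
    return [
        f"ALTER TABLE `{table}` ADD COLUMN IF NOT EXISTS `{name}` {infer_bq_type(value)}"
        for name, value in record.items()
        if name not in known
    ]


for statement in drift_statements("sales.orders", ["id"], {"id": 1, "total": 9.5, "paid": True}):
    print(statement)
```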

You can configure the root field for the row, and any first-level fields that you want to exclude from the record. You can specify characters to represent null values.
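The interaction of these three settings can be sketched in Python. The example assumes that the root field names a nested map that holds the row data and that the staged file is CSV. The function `build_csv_line`, its parameters, and the `NULL` marker are hypothetical choices for illustration, not the destination's property names or defaults.

```python
import csv
import io


def build_csv_line(record, root_field, excluded_fields, columns, null_value):
    """Serialize one record to a CSV line.

    root_field: key of the nested map that holds the row (empty = whole record).
    excluded_fields: first-level fields under the root to leave out.
    null_value: characters written in place of missing or null values.
    """
    root = record[root_field] if root_field else record
    kept = {k: v for k, v in root.items() if k not in excluded_fields}
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([null_value if kept.get(c) is None else kept[c] for c in columns])
    return buffer.getvalue()


record = {"meta": {"source": "app"}, "payload": {"id": 1, "name": None, "debug": "x"}}
print(build_csv_line(record, "payload", {"debug"}, ["id", "name"], "NULL"), end="")
```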

When processing CDC data, you can specify the key columns in the BigQuery table that the destination uses to evaluate the merge condition.
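To make the role of key columns concrete, the sketch below builds the kind of join condition that a BigQuery `MERGE` statement uses in its `ON` clause. The function name and the `T` and `S` aliases are hypothetical; the SQL that the destination generates is not shown in this text.

```python
def merge_condition(key_columns, target_alias="T", source_alias="S"):
    """Build an ON-clause condition that matches rows on every key column."""
    if not key_columns:
        raise ValueError("at least one key column is required")
    return " AND ".join(
        f"{target_alias}.`{column}` = {source_alias}.`{column}`"
        for column in key_columns
    )


print(merge_condition(["order_id", "region"]))
```

With a composite key, every key column must match for a staged row to be treated as the same row in the target table, which is why the comparisons are joined with `AND`.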

You can configure the destination to replace missing fields or fields with invalid data types with user-defined default values. You can also configure the destination to replace newline characters and trim leading and trailing spaces.
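The following Python sketch shows one plausible order for these clean-up steps: substitute a default for a missing value, replace newline characters, trim whitespace, and then fall back to the default if the value cannot be converted to the expected type. The function `clean_value` and its parameters are hypothetical and do not correspond to the destination's property names.

```python
def clean_value(value, expected_type, default, newline_replacement=" ", trim=True):
    """Normalize one field value before it is written to a staged file."""
    # Missing field: use the user-defined default.
    if value is None:
        return default
    if isinstance(value, str):
        # Replace Windows, Unix, and old Mac line endings.
        for newline in ("\r\n", "\n", "\r"):
            value = value.replace(newline, newline_replacement)
        if trim:
            value = value.strip()
    # Invalid data type: fall back to the default instead of failing.
    try:
        return expected_type(value)
    except (TypeError, ValueError):
        return default


print(clean_value("  hello\nworld ", str, ""))
print(clean_value("abc", int, 0))
print(clean_value(None, int, -1))
```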

Before you use the Google BigQuery destination, you must complete some prerequisite tasks.