Overwrite Data

By default, the Overwrite Data write mode removes all existing data from the table before writing new data. To remove only part of the existing data, you can define an overwrite condition.
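The difference between a full overwrite and a conditional overwrite can be illustrated with a small in-memory model. This is only a sketch of the semantics described above, not the destination's implementation: the `overwrite` function, the list-of-dicts table, and the callable condition are all hypothetical names introduced for illustration.

```python
from __future__ import annotations

from typing import Callable, Optional

Row = dict  # hypothetical row representation: column name -> value


def overwrite(
    existing: list[Row],
    incoming: list[Row],
    condition: Optional[Callable[[Row], bool]] = None,
) -> list[Row]:
    """Model of Overwrite Data: return the table contents after the write."""
    if condition is None:
        # Default behavior: all existing data is removed before writing.
        return list(incoming)
    # With an overwrite condition, only the rows that match it are removed.
    kept = [row for row in existing if not condition(row)]
    return kept + list(incoming)


table = [{"region": "EU", "value": 1}, {"region": "US", "value": 2}]
new_rows = [{"region": "EU", "value": 9}]

full = overwrite(table, new_rows)
partial = overwrite(table, new_rows, lambda row: row["region"] == "EU")
```

In this model, `full` contains only the new row, while `partial` keeps the existing US row and replaces the EU row.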

When writing to an existing table, the destination writes to the table using the existing partitioning. You can specify partition columns to use when the table is not partitioned or when it does not exist. If the table does not exist, the destination creates the table using the specified partition columns.
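The partitioning rules above reduce to a short decision. The following sketch assumes a hypothetical helper, `resolve_partition_columns`, whose name and parameters are not part of the product; it only restates the rules in code.

```python
from __future__ import annotations


def resolve_partition_columns(
    table_exists: bool,
    existing_partitions: list[str],
    specified_partitions: list[str],
) -> list[str]:
    """Return the partition columns that a write would use (illustrative only)."""
    if table_exists and existing_partitions:
        # An existing partitioned table keeps its partitioning.
        return list(existing_partitions)
    # The table is missing or unpartitioned: the specified columns apply.
    return list(specified_partitions)


existing_partitioned = resolve_partition_columns(True, ["country"], ["region"])
existing_unpartitioned = resolve_partition_columns(True, [], ["region"])
new_table = resolve_partition_columns(False, [], ["region"])
```

Here `existing_partitioned` is `["country"]`, while the other two calls return `["region"]`.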

You can specify how the destination behaves when the schema of the data differs from the schema of the table. The Overwrite Data write mode provides the following schema handling options:
  • Overwrite Schema - When the pipeline starts, the destination updates the table schema to match the schema of the first batch of data. The schema update occurs only once, at the beginning of the pipeline run. Subsequent records with incompatible schemas cause the pipeline to stop.

    Use the Overwrite Schema option to write data with a different schema to the table, while enforcing a consistent schema within the pipeline run.

  • Merge Schema - When the data contains unexpected fields, the destination creates matching columns in the table to enable writing the data. The destination does not delete columns from the table.

    You can optionally define an overwrite condition to overwrite only the data within specified partitions.

    Use the Merge Schema option to allow writing data with different schemas to the table throughout the pipeline run.

  • No Schema Update - The destination performs no updates to the schema. Data with unexpected fields or data types causes the pipeline to stop.

    You can optionally define an overwrite condition to overwrite only the data within specified partitions.

    Use the No Schema Update option to ensure that all data has a compatible schema.
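The three schema handling options can be compared with a small simulation. This is a simplified sketch under stated assumptions, not the destination's logic: schemas are modeled as dicts from column name to type name, "incompatible" is taken to mean "not identical" for Overwrite Schema, a type conflict under Merge Schema is treated as an error, and a stopped pipeline is modeled as a hypothetical `SchemaMismatch` exception.

```python
from __future__ import annotations

from enum import Enum


class SchemaMode(Enum):
    OVERWRITE_SCHEMA = "overwrite_schema"
    MERGE_SCHEMA = "merge_schema"
    NO_SCHEMA_UPDATE = "no_schema_update"


class SchemaMismatch(Exception):
    """Stands in for the pipeline stopping (hypothetical name)."""


def apply_batches(
    table_schema: dict[str, str],
    batch_schemas: list[dict[str, str]],
    mode: SchemaMode,
) -> dict[str, str]:
    """Return the table schema after processing each batch schema in order."""
    schema = dict(table_schema)
    for index, batch in enumerate(batch_schemas):
        if mode is SchemaMode.OVERWRITE_SCHEMA:
            if index == 0:
                # One-time update: the table takes the first batch's schema.
                schema = dict(batch)
            elif batch != schema:
                # Assumption: any later difference counts as incompatible.
                raise SchemaMismatch(f"batch {index} differs from table schema")
        elif mode is SchemaMode.MERGE_SCHEMA:
            for column, type_name in batch.items():
                if column in schema and schema[column] != type_name:
                    # Assumption: a conflicting type is treated as an error.
                    raise SchemaMismatch(f"type conflict on column {column}")
                # New columns are added; existing columns are never deleted.
                schema.setdefault(column, type_name)
        else:
            for column, type_name in batch.items():
                if schema.get(column) != type_name:
                    # Unexpected field or data type: no update is performed.
                    raise SchemaMismatch(f"unexpected column or type: {column}")
    return schema


table = {"id": "int", "name": "string"}
batch = {"id": "int", "region": "string"}

overwritten = apply_batches(table, [batch], SchemaMode.OVERWRITE_SCHEMA)
merged = apply_batches(table, [batch], SchemaMode.MERGE_SCHEMA)
```

With the same input batch, `overwritten` drops the `name` column and `merged` keeps it while adding `region`. Passing the batch with `SchemaMode.NO_SCHEMA_UPDATE` raises `SchemaMismatch` in this model.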

For example, say you want to drop all existing data in the table before writing to it. The table is not partitioned, but you want to partition it by the region column. Also, if records include new fields, you want the destination to update the table schema so that those records can be written to the table.

You can use the following configuration to achieve this behavior: