Overwrite Partition Requirement

When writing to partitioned files, the File destination can overwrite files within affected partitions rather than overwriting the entire data set. For example, if output data includes only data within a 03-2019 partition, then the destination can overwrite the files in the 03-2019 partition and leave all other partitions untouched.

To overwrite only the affected partitions, Spark must be configured to allow overwriting data within a partition. When writing to unpartitioned files, no action is needed.

To enable overwriting partitions, set the spark.sql.sources.partitionOverwriteMode Spark configuration property to dynamic.
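The difference between Spark's default static mode and dynamic mode can be illustrated with a small plain-Python model. This is only a sketch of the semantics, not how Spark or the File destination is implemented. The overwrite function, the dictionary layout, and the file names are hypothetical and exist only for illustration.

```python
# Toy model of a partitioned data set: partition name -> list of file names.
# Illustration only; the function and file names are hypothetical.

def overwrite(existing, output, mode):
    """Return the partitions left in place after an overwrite-style write.

    existing: dict mapping partition name -> files already present
    output:   dict mapping partition name -> files being written
    mode:     "static" or "dynamic", mirroring the two values of
              spark.sql.sources.partitionOverwriteMode
    """
    if mode == "static":
        # Static mode clears every existing partition, then writes the output.
        return {part: list(files) for part, files in output.items()}
    if mode == "dynamic":
        # Dynamic mode replaces only the partitions present in the output.
        result = {part: list(files) for part, files in existing.items()}
        for part, files in output.items():
            result[part] = list(files)
        return result
    raise ValueError(f"unknown mode: {mode!r}")


existing = {
    "01-2019": ["a.parquet"],
    "02-2019": ["b.parquet"],
    "03-2019": ["c.parquet"],
}
# The output contains data for the 03-2019 partition only.
output = {"03-2019": ["c-new.parquet"]}

static_result = overwrite(existing, output, "static")
dynamic_result = overwrite(existing, output, "dynamic")

print(sorted(static_result))   # ['03-2019']
print(sorted(dynamic_result))  # ['01-2019', '02-2019', '03-2019']
```

In the dynamic case, the 01-2019 and 02-2019 partitions keep their original files, which matches the 03-2019 example described earlier.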

You can configure the property in Spark or in individual pipelines. Configure the property in Spark to enable overwriting partitions for all Transformer pipelines that run on that Spark installation.
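As one possible way to set the property in Spark itself, the fragment below assumes that your cluster reads defaults from the standard spark-defaults.conf file. The file location is an assumption; your cluster manager or distribution may expose Spark defaults through a different mechanism.

```
# spark-defaults.conf (assumed location: $SPARK_HOME/conf)
# Applies to every Spark application that loads these defaults.
spark.sql.sources.partitionOverwriteMode    dynamic
```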

To enable overwriting partitions for an individual pipeline, add an extra Spark configuration property on the Cluster tab of the pipeline properties.