Defining the CRUD Operation for CDC Data

When you configure the Databricks Delta Lake destination to use the MERGE command to load CDC data, the destination can insert, update, upsert, or delete data.

When writing CDC data, the destination uses the CRUD operation specified in the sdc.operation.type record header attribute. The destination performs operations based on the following numeric values:
  • 1 for INSERT
  • 2 for DELETE
  • 3 for UPDATE
  • 4 for UPSERT
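
As a rough illustration of the mapping above, the following Python sketch resolves the value of the sdc.operation.type attribute to a named operation. The CrudOperation enum and the operation_from_header function are hypothetical names introduced for this example, and the plain dictionary stands in for a record's header attributes. This is not the destination's actual implementation.

```python
from enum import IntEnum


class CrudOperation(IntEnum):
    """Numeric codes listed above for the sdc.operation.type attribute."""
    INSERT = 1
    DELETE = 2
    UPDATE = 3
    UPSERT = 4


def operation_from_header(attributes):
    """Return the CRUD operation named by the sdc.operation.type attribute.

    `attributes` is a plain dict standing in for record header attributes
    (an assumption for this sketch). The value may be a string or an integer.
    """
    raw = attributes.get("sdc.operation.type")
    if raw is None:
        raise KeyError("sdc.operation.type header attribute is not set")
    # int() accepts both "2" and 2; a code outside 1-4 raises ValueError.
    return CrudOperation(int(raw))


if __name__ == "__main__":
    print(operation_from_header({"sdc.operation.type": "2"}).name)
```

In this sketch, a missing attribute or a code outside the four listed values raises an error. How the destination itself handles such records is not covered here.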

If your pipeline includes a CDC-enabled origin that processes changed data, the destination reads the operation type from the sdc.operation.type header attribute that the origin generates. If your pipeline uses a non-CDC origin, you can use the Expression Evaluator or a scripting processor to set the sdc.operation.type record header attribute. For more information about Data Collector changed data processing and a list of CDC-enabled origins, see Processing Changed Data.
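
For a non-CDC origin, the logic that a scripting processor might apply can be sketched in plain Python as follows. The change_kind field, the OPERATION_CODES table, and the tag_record function are all hypothetical and exist only for this example. The two dictionaries stand in for a record's fields and header attributes. In an actual scripting processor, you would make the same assignment through that processor's record object.

```python
# Hypothetical mapping from a source field value to the numeric codes
# that the destination expects in sdc.operation.type.
OPERATION_CODES = {
    "insert": "1",
    "delete": "2",
    "update": "3",
    "upsert": "4",
}


def tag_record(fields, attributes, default="upsert"):
    """Set sdc.operation.type from the hypothetical change_kind field.

    `fields` and `attributes` are plain dicts used as stand-ins for record
    data and record header attributes. When change_kind is absent, the
    default operation is used. An unrecognized value raises KeyError.
    """
    kind = str(fields.get("change_kind", default)).lower()
    attributes["sdc.operation.type"] = OPERATION_CODES[kind]
    return attributes


if __name__ == "__main__":
    headers = tag_record({"id": 7, "change_kind": "DELETE"}, {})
    print(headers)
```

Defaulting to upsert when the field is missing is a design choice made for this sketch only. Choose the default that matches how your source data should be applied to the target table.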