PostgreSQL CDC Client

Supported pipeline types:
  • Data Collector

The PostgreSQL CDC Client origin processes Write-Ahead Logging (WAL) data to generate change data capture records for a PostgreSQL database. For information about supported versions, see Supported Systems and VersionsSupported Systems and Versions in the Data Collector documentation.

You might use this origin to perform database replication. You can use a separate pipeline with the JDBC Query Consumer or JDBC Multitable Consumer origin to read existing data. Then start a pipeline with the PostgreSQL CDC Client origin to process subsequent changes.

The PostgreSQL CDC Client generates a single record from each transaction. Since each transaction can include multiple CRUD operations, the PostgreSQL CDC Client origin can also include multiple operations in a record.

As a result, the origin does not write the CRUD operations to the sdc.operation.type record header attribute. Depending on your use case, you might use a scripting processor to convert the records as needed. Or, you might use a Field Pivoter and other processors to separate the data to create a record for each operation.

When you configure the PostgreSQL CDC Client, you configure the change capture details, such as the schema and tables to read from, the initial change to use, and the operations to include. You can also use a connectionconnection to configure the origin.

You define the name for the replication slot to be used, and specify whether to remove replication slots on close. You can also specify the behavior when the origin encounters an unsupported data type and include the data for those fields in the record as unparsed strings. When the source database has high-precision timestamps, you can configure the origin to write string values rather than datetime values to maintain the precision.

To determine how the origin connects to the database, you specify connection information, a query interval, number of retries, and any custom JDBC configuration properties that you need. You can configure advanced connection properties.

Note: The user that connects to the database must have the replication or superuser role.