JDBC Producer
Supported pipeline types:
|
When you configure JDBC Producer, you specify connection information, table name, and optionally define field mappings.
By default, JDBC Producer writes data to the table based on the matching field names. You can override the default field mappings by defining specific mappings. To determine which table rows to update or delete, the destination detects the list of primary key columns for the table, and then uses the fields mapped to those columns to match rows.
You can configure the stage to rollback an entire batch if an error occurs while writing part of the batch. You can also configure custom properties that your driver requires.
The JDBC Producer can use CRUD operations defined in the
sdc.operation.type
record header attribute to write
data. You can define a default operation for records without the header
attribute or value. You can also configure whether to use multi-row
operations for inserts and deletes, and how to handle records with
unsupported operations.
When processing data from a CDC-enabled origin, you can specify the origin change log to aid record processing. For information about Data Collector change data processing and a list of CDC-enabled origins, see Processing Changed Data.
To use a JDBC version older than 4.0, you can specify the driver class name and define a health check query.
You can also use a connection to configure the destination.
You can use the JDBC Producer as part of the Drift Synchronization Solution for PostgreSQL.
Database Vendors and Drivers
The JDBC Producer destination can write data to multiple database vendors.
Database Vendor | Supported Versions | Tested Versions |
---|---|---|
MySQL | MySQL 5.7 and later |
|
Oracle |
Also supported:
|
|
PostgreSQL | PostgreSQL 9.x and later |
|
Microsoft SQL Server |
|
|
Installing the JDBC Driver
For information about installing additional drivers, see Install External Libraries in the Data Collector documentation.
Define the CRUD Operation
The JDBC Producer destination can insert, update, or delete data. The destination writes the records based on the CRUD operation defined in a CRUD operation header attribute or in operation-related stage properties.
- CRUD operation header attribute
- You can define the CRUD
operation in a CRUD operation record header attribute. The
destination looks for the CRUD operation to use in the
sdc.operation.type
record header attribute. - Operation stage properties
- You define a default operation in the destination
properties. The destination uses the default operation when the
sdc.operation.type
record header attribute is not set.
Update and Delete Operations
For update and delete operations, the JDBC Producer destination automatically detects the primary key of the table and uses that key in the WHERE clause that updates or deletes rows. The destination supports compound primary keys, keys that consist of more than one column.
customer
, the
id
column is the primary key:
id | first | middle | last |
---|---|---|---|
1 | john | f | smith |
2 | john | m | doe |
3 | mary | jane | smith |
{
"id": 1,
"first": "john",
"middle": "m",
"last": "doe"
}
DELETE FROM customer WHERE id = 1
Note that the destination matches the row based on the primary key and not the other fields in the record.
Single and Multi-row Operations
JDBC Producer performs single-row operations by default. That is, it executes a SQL statement for each record. When supported by the destination database, you can configure JDBC Producer to perform multi-row operations. Depending on the sequence of the data, multi-row operations can improve pipeline performance.
When performing multi-row operations, JDBC Producer creates a single SQL statement for sequential insert rows and for sequential delete rows. JDBC Producer does not perform multi-row update operations. You can configure the Statement Parameter Limit property to limit the number of parameters in an insert operation - that is, you can limit the number of records included in an insert statement.
For example, say the pipeline generates three insert records, followed by two update records, and four delete records. If you enable multi-row operations and do not set a statement parameter limit, JDBC Producer generates a single insert SQL statement for the three insert records, two update statements - one for each of the update records, and a single delete statement for the four delete records. On the other hand, if you enable multi-row operations and set the statement parameter limit to two, JDBC Producer generates two insert SQL statements - one for two insert records and one for the third insert record, two update statements - one for each of the update records, and a single delete statement for the four delete records.
Error handling for multi-row operations depends on the database. If the database reports the individual record that causes an error in a multi-row statement, the stage sends that record to the error stream. If the database does not report which record causes an error, the stage sends all the records from the statement to the error stream.
INSERT INTO <table name> (<col1>, <col2>, <col3>)
VALUES (<record1 field1>,<record1 field2>,<record1 field3>),
(<r2 f1>,<r2 f2>,<r2 f3>), (<r3 f1>,<r3 f2>,<r3 f3>),...;
DELETE FROM <table name> WHERE <primary key> IN (<key1>, <key2>, <key3>,...);
DELETE FROM <table name> WHERE (<pkey1>, <pkey2>, <pkey3>)
IN ((<key1-1>, <key1-2>, <key1-3>),(<key2-1>, <key2-2>, <key2-2>),...);
Configuring a JDBC Producer
Configure the JDBC Producer to use JDBC to write data to a database table.