Amazon Redshift

The Amazon Redshift destination writes data to an Amazon Redshift table. Use the destination in Databricks, Dataproc, or EMR cluster pipelines.

The Amazon Redshift destination stages data on Amazon S3 before writing it to Redshift. When you configure the destination, you specify the staging location on S3. You can optionally enable server-side encryption and have the destination delete objects from S3 after writing them to Redshift. You can also use a connection to configure the destination.

You define the Amazon Redshift endpoint, schema, and table to write to. You can optionally define fields to partition by.
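For illustration only, the following PySpark sketch shows the same basic flow outside of StreamSets: a DataFrame is staged in an S3 location and then copied into a Redshift schema and table over JDBC. It assumes the open-source spark-redshift community connector and uses placeholder endpoint, bucket, and table names; it does not reflect how the destination implements the write internally.

```python
# Minimal sketch: stage a DataFrame on S3, then load it into a Redshift table
# over JDBC. Assumes the spark-redshift community connector is on the classpath;
# the cluster endpoint, bucket, and table names below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("redshift-write-sketch").getOrCreate()

df = spark.createDataFrame(
    [(1, "2023-01-01", 19.99), (2, "2023-01-02", 5.49)],
    ["order_id", "order_date", "amount"],
)

(df.write
   .format("io.github.spark_redshift_community.spark.redshift")
   .option("url", "jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev?user=writer&password=secret")
   .option("dbtable", "sales.orders")                            # schema.table to write to
   .option("tempdir", "s3a://example-bucket/redshift-staging/")  # S3 staging location
   .option("forward_spark_s3_credentials", "true")               # reuse the cluster's AWS credentials for the COPY
   .mode("append")                                               # plain insert
   .save())
```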

You specify the write mode to use: insert, merge, or delete. You can use compound keys for merges and deletes. When merging data, you can specify a distribution key to improve performance. When deleting data, you can specify the action to take for records with no match in the table.
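To make the merge and delete semantics concrete, the sketch below builds the kind of staging-table SQL that a merge or delete by compound key typically reduces to in Redshift. The table, staging table, and key names are hypothetical, and the delete-then-insert pattern is a common Redshift idiom rather than the exact statements the destination issues.

```python
# Rough sketch of staging-table merge and delete SQL in Redshift, keyed on a
# compound key (customer_id, order_id). All names are placeholders.
KEYS = ["customer_id", "order_id"]
TARGET, STAGING = "sales.orders", "sales.orders_staging"

key_match = " AND ".join(f"{TARGET}.{k} = {STAGING}.{k}" for k in KEYS)

# Merge: remove target rows that will be replaced, then insert the staged versions.
merge_sql = f"""
BEGIN;
DELETE FROM {TARGET} USING {STAGING} WHERE {key_match};
INSERT INTO {TARGET} SELECT * FROM {STAGING};
END;
"""

# Delete: drop target rows whose keys appear in the staged delete records;
# staged records with no match in the target are simply ignored here.
delete_sql = f"DELETE FROM {TARGET} USING {STAGING} WHERE {key_match};"

print(merge_sql)
print(delete_sql)
```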

You can optionally have the destination truncate the table before writing to it. You can also have the destination create the table. When creating the table, you specify the Redshift distribution style to use, and you specify the default length and any custom lengths that you want to use for Varchar fields.
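As a rough illustration of these options, the snippet below shows Redshift DDL with a key distribution style, a default VARCHAR length, and a custom length for one field, along with the TRUNCATE statement used to clear a table before a write. The table, column names, and lengths are placeholders, not output captured from the destination.

```python
# Hypothetical DDL that table creation with a distribution style and custom
# VARCHAR lengths might produce; names and lengths are placeholders.
create_sql = """
CREATE TABLE IF NOT EXISTS sales.orders (
    order_id    BIGINT,
    customer_id BIGINT,
    status      VARCHAR(32),    -- custom length for a known-short field
    notes       VARCHAR(256)    -- default length for other string fields
)
DISTSTYLE KEY
DISTKEY (customer_id);
"""

# Truncating before the write clears existing rows but keeps the table definition.
truncate_sql = "TRUNCATE TABLE sales.orders;"

print(create_sql)
print(truncate_sql)
```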

You configure security credentials and the database user for the write. When using AWS access keys, you can have the destination automatically create the user. You can also configure advanced properties such as performance-related properties and proxy server properties.

Before using the Amazon Redshift destination, verify whether you need to install a JDBC driver.

StreamSets has tested this destination with Amazon EMR versions 5.13.0 and 5.29.0.