Azure SQL
The Azure SQL destination writes data to a table in Azure SQL Database, Azure Synapse SQL Pool, or Microsoft SQL Server version 2008 or later.
The destination can perform a bulk copy when writing to the table. When performing a bulk copy, it can also create the table. When not performing a bulk copy, the destination can append data to or overwrite data in an existing table.
When you configure the Azure SQL destination, you specify the database URL, database name, and credentials. You can also define any additional JDBC configuration properties you want to use.
You specify the table to write to and whether to perform a bulk copy. When performing a bulk copy, you define related properties including the batch size and timeout. You can optionally configure additional bulk copy properties.
When not performing a bulk copy, you specify the mode the destination uses for writing to an existing table.
Before you use the Azure SQL destination, install a JDBC driver.
Installing the Microsoft JDBC Driver
Before using the Azure SQL destination, you must install a Microsoft JDBC driver for SQL Server 6.2 or later. Install the driver as an external library for the JDBC stage library.
Transformer bundles a JDBC driver into the launched Spark application so that the driver is available on each node in the cluster.
Partitioning
Spark runs a Transformer pipeline just as it runs any other application, splitting the data into partitions and performing operations on the partitions in parallel.
When the pipeline starts processing a new batch, Spark determines how to split pipeline data into initial partitions based on the origins in the pipeline. Spark uses these partitions for the rest of the pipeline processing, unless a processor causes Spark to shuffle the data.
When writing to a database table, Spark creates one connection to the database for each partition.
Write Mode
The Azure SQL destination uses the write mode to determine how to write to an existing table. You can configure a write mode when the destination is not configured to perform a bulk copy.
- Append rows to existing table
- The destination appends rows to the existing table.
- Overwrite existing table
- The destination removes all rows in the existing table before writing new
rows to the table. When you select overwrite mode, you also configure how
the destination removes rows from the table:
- Truncate all rows in the existing database table, and then write data to the table.
- Drop the existing table, recreate the table, and then write data to the table.