Performance Optimization
Use the following tips to optimize for performance and cost-effectiveness when using the Snowflake destination:
- Increase the batch size
- The maximum batch size is determined by the origin in the pipeline and typically has a default value of 1,000 records. To take advantage of Snowflake's bulk loading abilities, increase the maximum batch size in the origin to 20,000-50,000 records. Be sure to increase the Data Collector Java heap size, as needed. For more information, see Java Heap Size in the Data Collector documentation.
- Configure runners to wait indefinitely when idle
- With the default configuration, a runner generates an empty batch after waiting idly for 60 seconds. As a result, the destination continues to execute metadata queries against Snowflake, even though no data needs to be processed. To reduce Snowflake charges when a runner waits idly, set the Runner Idle Time property to -1. This configures runners to wait indefinitely when idle without generating empty batches, which allows Snowflake to pause processing. Important: Configuring runners to wait indefinitely when idle is strongly recommended. Using the default runner idle time can result in unnecessary Snowflake resource consumption and runtime costs.
- Use multiple threads
- When writing to Snowflake using Snowpipe or the COPY command, you can use multiple threads to improve performance when you include a multithreaded origin in the pipeline. When Data Collector resources allow, using multiple threads enables processing multiple batches of data concurrently.
- Enable additional connections to Snowflake
- When writing to multiple Snowflake tables using the COPY or MERGE commands, increase the number of connections that the Snowflake destination makes to Snowflake. Each additional connection allows the destination to write to an additional table concurrently.
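
To get a rough sense of what the batch size and runner idle time tips mean in numbers, the following Python sketch works through the arithmetic. It is an illustration only and not part of Data Collector: the function names are hypothetical, and the idle calculation assumes runners that stay idle for the whole day, so it gives an upper bound on empty batches rather than a measurement of Snowflake queries or cost.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def empty_batches_per_day(runner_idle_time: int, runners: int = 1) -> int:
    """Upper bound on empty batches generated per day by fully idle runners.

    runner_idle_time mirrors the Runner Idle Time property, in seconds.
    A value of -1 means runners wait indefinitely, so no empty batches.
    """
    if runner_idle_time == -1:
        return 0
    if runner_idle_time <= 0:
        raise ValueError("runner_idle_time must be positive or -1")
    return runners * (SECONDS_PER_DAY // runner_idle_time)


def batches_needed(total_records: int, max_batch_size: int) -> int:
    """Number of batches needed to move total_records (ceiling division)."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    return (total_records + max_batch_size - 1) // max_batch_size


if __name__ == "__main__":
    # Default 60-second idle time versus waiting indefinitely (-1).
    print(empty_batches_per_day(60, runners=4))
    print(empty_batches_per_day(-1, runners=4))
    # Default 1,000-record batches versus 50,000-record batches.
    print(batches_needed(1_000_000, 1_000))
    print(batches_needed(1_000_000, 50_000))
```

Under these assumptions, a single idle runner with the default 60-second idle time can generate up to 1,440 empty batches per day, while raising the maximum batch size from 1,000 to 50,000 records reduces the number of batches for one million records from 1,000 to 20.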