Failover Retries
When a Data Collector job is enabled for failover, Control Hub retries the failover indefinitely by default. To stop failover after a given number of retries, define the maximum number of retries with the following properties:
- Failover Retries per Data Collector
  Maximum number of pipeline failover retries to attempt on each available Data Collector. The initial start of a pipeline instance on a Data Collector counts as the first retry attempt.
- Global Failover Retries
  Maximum number of pipeline failover retries to attempt across all available Data Collectors.
Control Hub increments the failover retry count and applies the retry limit only when the pipeline encounters an error and transitions to a Start_Error or Run_Error state. If the engine running the pipeline shuts down, failover always occurs and Control Hub does not increment the failover retry count.
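The counting rule above can be illustrated with a minimal Python sketch. This is not Control Hub code: the `RetryTracker` class, its fields, and the `"Engine_Shutdown"` reason string are hypothetical names chosen for illustration, and `None` is assumed to stand for the default of unlimited retries.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Pipeline states that count toward the retry limits, as named in the text above.
ERROR_STATES = {"Start_Error", "Run_Error"}


@dataclass
class RetryTracker:
    """Hypothetical bookkeeping for failover retry counts (illustration only)."""
    per_collector_limit: Optional[int] = None  # None = unlimited (assumed default)
    global_limit: Optional[int] = None         # None = unlimited (assumed default)
    per_collector: Dict[str, int] = field(default_factory=dict)
    total: int = 0

    def record(self, collector: str, reason: str) -> None:
        """Increment the counts only when failover was caused by an error state."""
        if reason in ERROR_STATES:
            self.per_collector[collector] = self.per_collector.get(collector, 0) + 1
            self.total += 1
        # Any other reason (for example an engine shutdown) still triggers
        # failover, but leaves the retry counts unchanged.

    def may_retry_on(self, collector: str) -> bool:
        """Return True if neither limit has been reached for this collector."""
        if self.global_limit is not None and self.total >= self.global_limit:
            return False
        used = self.per_collector.get(collector, 0)
        return self.per_collector_limit is None or used < self.per_collector_limit


tracker = RetryTracker(per_collector_limit=2)
tracker.record("C", "Run_Error")        # counted
tracker.record("C", "Engine_Shutdown")  # hypothetical reason; not counted
print(tracker.per_collector, tracker.may_retry_on("C"))
```

In this sketch, a tracker created with no limits never refuses a retry, which mirrors the default of retrying indefinitely.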
Example for Failover Retries per Data Collector
Let's look at an example of how Control Hub maintains the Failover Retries per Data Collector property.
- Control Hub sends one pipeline instance to Data Collector A and another to Data Collector B. Data Collector C and Data Collector D serve as backups.
- After some time, the pipeline on Data Collector A fails.
- Control Hub attempts to restart the failed pipeline on Data Collector C, but the failover attempt fails. Control Hub increments the failover retry count for Data Collector C to one, and then successfully restarts the failed pipeline on Data Collector D.
- After additional time, the pipeline on Data Collector B fails.
- Control Hub attempts to restart the failed pipeline on Data Collector C, but the failover attempt fails. Control Hub increments the failover retry count for Data Collector C to two, and then successfully restarts the failed pipeline on Data Collector A.
Because Data Collector C has now reached the maximum number of failover retries (two, in this example), Control Hub does not attempt to restart additional pipelines for this job on Data Collector C.
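The walkthrough above can be replayed with a short Python simulation. It is a simplified sketch, not Control Hub's scheduling logic: it assumes the per-Data Collector limit is 2 (which the example implies), it counts only the failed restart attempts that the example mentions, and the `fail_over` function, the candidate order, and the `failing` sets are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Set

LIMIT = 2            # assumed value of Failover Retries per Data Collector
attempts = Counter() # failed failover attempts recorded per Data Collector


def fail_over(candidates: Iterable[str], failing: Set[str]) -> Optional[str]:
    """Try each candidate in order and return the one that restarts the pipeline.

    `failing` names the Data Collectors on which the restart fails in this
    scenario. Collectors that already reached LIMIT are skipped entirely.
    """
    for dc in candidates:
        if attempts[dc] >= LIMIT:
            continue            # limit reached: do not attempt this collector
        if dc in failing:
            attempts[dc] += 1   # failed attempt counts toward the limit
            continue
        return dc               # successful restart
    return None                 # no eligible collector could run the pipeline


# Pipeline on A fails: C is tried first and fails, D succeeds.
first = fail_over(["C", "D"], failing={"C"})
# Pipeline on B fails: C is tried again and fails, A succeeds.
second = fail_over(["C", "A"], failing={"C"})
# Any later failover skips C, even if a restart there would now succeed.
third = fail_over(["C"], failing=set())

print(first, second, third, dict(attempts))
```

After the second failover, `attempts["C"]` equals the assumed limit, so the third call returns `None` without trying Data Collector C.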