Data Collector Pipeline Failover
You can enable a Data Collector or Data Collector Edge job for pipeline failover. Enable pipeline failover to minimize downtime due to unexpected pipeline failures and to help you achieve high availability.
When a job is enabled for failover, Control Hub can restart a failed pipeline on another available execution engine that is assigned all labels specified for the job, starting from the last-saved offset. Control Hub performs pipeline failover when either of the following occurs:
- The Data Collector running the pipeline unexpectedly shuts down.
- The pipeline has reached the maximum number of retry attempts after encountering an error and has transitioned to a Start_Error or Run_Error state.
An available Data Collector is any Data Collector that is assigned all labels specified for the job, is not currently running a pipeline instance for the job, and has not exceeded any resource thresholds. When multiple Data Collectors are available, Control Hub prioritizes Data Collectors that have not previously failed the pipeline and Data Collectors that are currently running the fewest pipelines.
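The availability and prioritization rules can be pictured with a small, illustrative Python sketch. This is not Control Hub's actual implementation; the DataCollector fields and the pick_failover_target function are hypothetical stand-ins for the label, resource-threshold, and pipeline-count checks described above.

```python
from dataclasses import dataclass


@dataclass
class DataCollector:
    """Simplified view of a Data Collector as seen by the failover logic (illustrative only)."""
    name: str
    labels: set                      # labels assigned to this Data Collector
    running_pipelines: int           # pipelines currently running, across all jobs
    runs_this_job: bool = False      # already running a pipeline instance for this job
    over_threshold: bool = False     # has exceeded a resource threshold
    failed_this_pipeline: bool = False  # previously failed this job's pipeline


def pick_failover_target(job_labels, collectors):
    """Return the Data Collector that would be favored for failover, or None if none is available."""
    # An available Data Collector has all of the job's labels, is not already
    # running a pipeline instance for the job, and is under its resource thresholds.
    available = [
        dc for dc in collectors
        if job_labels <= dc.labels
        and not dc.runs_this_job
        and not dc.over_threshold
    ]
    if not available:
        return None
    # Prefer engines that have not previously failed the pipeline,
    # then engines running the fewest pipelines.
    return min(available, key=lambda dc: (dc.failed_this_pipeline, dc.running_pipelines))
```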
For example, you enable a job for failover, set the number of pipeline instances to one, and then start the job on a group of three Data Collectors. Control Hub initially sends the pipeline instance to Data Collector A, but the pipeline fails on Data Collector A. At the time of failover, Data Collector A is running no other pipelines, Data Collector B is running one other pipeline, and Data Collector C is running two other pipelines. Control Hub restarts the failed pipeline on Data Collector B. If all three Data Collectors had already failed the pipeline, then Control Hub would restart the failed pipeline on Data Collector A.
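Running the worked example through the sketch above (reusing the hypothetical DataCollector and pick_failover_target names from it) reproduces the same choices:

```python
job_labels = {"failover-demo"}  # hypothetical label for this example

collectors = [
    DataCollector("A", {"failover-demo"}, running_pipelines=0, failed_this_pipeline=True),
    DataCollector("B", {"failover-demo"}, running_pipelines=1),
    DataCollector("C", {"failover-demo"}, running_pipelines=2),
]

# B has not failed the pipeline and runs fewer pipelines than C.
print(pick_failover_target(job_labels, collectors).name)  # -> B

# If every Data Collector has already failed the pipeline, the tiebreaker is
# the pipeline count alone, so the choice falls back to A.
for dc in collectors:
    dc.failed_this_pipeline = True
print(pick_failover_target(job_labels, collectors).name)  # -> A
```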