Balancing Jobs Enabled for Failover

You can balance active Data Collector jobs enabled for pipeline failover. To balance a job, Control Hub redistributes the pipeline load across available Data Collectors that are running the fewest number of pipelines and that have not exceeded any resource thresholds.

When balancing an active job, Control Hub performs the following actions:
  • Automatically determines if the pipeline load is evenly distributed across available Data Collectors that have not exceeded any resource thresholds.

    If the pipeline load is evenly distributed, Control Hub does not continue with the remaining actions.

    If the pipeline load is not evenly distributed, Control Hub continues with the remaining actions. The load is not evenly distributed when an available Data Collector that is not currently running a pipeline instance for the job is running fewer pipelines than a Data Collector that is currently running a pipeline instance for the job.

  • Stops a running pipeline instance for the job on one or more Data Collectors.
  • Restarts the pipeline from the last-saved offset on a matching number of available Data Collectors that have not exceeded any resource thresholds.
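The decision described in the actions above can be sketched in a few lines of Python. This is only an illustrative model, not Control Hub code or its API: the function name plan_rebalance, the pipeline_counts mapping, and the over_threshold set are all hypothetical, and the pairing of the busiest host with the least-loaded candidate is an assumption about how the moves might be chosen.

```python
def plan_rebalance(pipeline_counts, job_hosts, over_threshold=frozenset()):
    """Return (source, target) moves for one job; an empty list means balanced.

    pipeline_counts: hypothetical mapping of Data Collector name -> running pipelines
    job_hosts:       collectors currently running an instance of this job
    over_threshold:  collectors that have exceeded a resource threshold
    """
    # Candidates: available collectors not yet running an instance of this job,
    # least loaded first.
    candidates = sorted(
        (dc for dc in pipeline_counts
         if dc not in job_hosts and dc not in over_threshold),
        key=lambda dc: (pipeline_counts[dc], dc),
    )
    # Hosts of this job, busiest first.
    hosts = sorted(job_hosts, key=lambda dc: (-pipeline_counts[dc], dc))

    moves = []
    for source, target in zip(hosts, candidates):
        # Move only when the candidate runs fewer pipelines than the host.
        if pipeline_counts[target] < pipeline_counts[source]:
            moves.append((source, target))
    return moves


# Collector "a" is idle while "c" runs three pipelines, one of them for this job.
print(plan_rebalance({"a": 0, "b": 1, "c": 3}, {"b", "c"}))  # [('c', 'a')]
```

Each returned pair stands for one stop on the source and one restart from the last-saved offset on the target, so the number of stopped and restarted instances always matches.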

In most cases, you'll balance a job after a pipeline failover occurs. However, you can balance a job enabled for pipeline failover anytime you notice that the pipeline load is not evenly distributed across available Data Collectors.

For example, let’s say that you run a job on a group of four Data Collectors assigned the WesternRegion label. You’ve enabled failover for the job and have set the Number of Instances property to two, reserving two of the Data Collectors for pipeline failover. When you start the job, a pipeline instance runs on Data Collector 1 and Data Collector 2 because they are currently running the fewest number of pipelines.

After a while, Data Collector 1 unexpectedly shuts down, causing the pipeline to fail over to Data Collector 3, which is already running two pipelines for two other jobs. When Data Collector 1 restarts, it does not immediately run any pipelines, while Data Collector 3 is now running three pipelines. You balance the job to redistribute the pipeline load. Control Hub automatically determines that Data Collector 1 is available and running the fewest number of pipelines. Control Hub stops the pipeline on Data Collector 3 and restarts the pipeline on Data Collector 1, starting from the last-saved offset.
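To make the pipeline counts in this scenario easier to follow, the short Python trace below replays it step by step. It is a rough sketch under stated assumptions, not Control Hub behavior: the helper names are hypothetical, the starting counts for Data Collectors 1, 2, and 4 are invented for illustration (the scenario only states that Data Collector 3 runs two pipelines for other jobs), and resource thresholds are left out.

```python
def least_loaded(counts, exclude=()):
    """Return the collector running the fewest pipelines, skipping `exclude`."""
    eligible = [dc for dc in counts if dc not in exclude]
    return min(eligible, key=lambda dc: (counts[dc], dc))


def run_scenario():
    # Pipelines per Data Collector before the job starts. Only DC3's two
    # pipelines come from the scenario; the other values are assumptions.
    counts = {"DC1": 0, "DC2": 0, "DC3": 2, "DC4": 3}
    hosts = set()  # collectors running an instance of this job

    # Start the job with Number of Instances set to two.
    for _ in range(2):
        dc = least_loaded(counts, exclude=hosts)
        counts[dc] += 1
        hosts.add(dc)

    # DC1 shuts down and its instance fails over to DC3, as in the scenario.
    hosts.remove("DC1")
    counts["DC1"] = 0
    hosts.add("DC3")
    counts["DC3"] += 1

    # Balance: compare the least-loaded non-host with the busiest host.
    target = least_loaded(counts, exclude=hosts)
    source = max(hosts, key=lambda dc: counts[dc])
    if counts[target] < counts[source]:
        hosts.remove(source)
        counts[source] -= 1
        hosts.add(target)
        counts[target] += 1
    return counts, hosts
```

With these assumed starting values, the job ends up back on Data Collector 1 and Data Collector 2, and Data Collector 3 returns to the two pipelines it was running for the other jobs.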