Scaling Out Active Jobs

Control Hub can automatically scale out pipeline processing for an active Data Collector or SDC Edge job when you set the number of pipeline instances to -1.

When the number of pipeline instances is set to -1, Control Hub runs one pipeline instance on each available Data Collector or SDC Edge. If an additional Data Collector or SDC Edge becomes available while the job is active, Control Hub automatically starts an additional pipeline instance on that execution engine.

For example, if three Data Collectors have all of the specified labels for the job when you start the job, Control Hub runs three pipeline instances, one on each Data Collector. If you register another Data Collector with the same labels as the currently running job, Control Hub automatically starts a fourth pipeline instance on that newly available Data Collector.
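The example above can be modeled with a short Python sketch. This is an illustration of the described behavior only, not Control Hub code: the `Engine` class, the `matching_engines` and `engines_to_start` functions, and the engine and label names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """Hypothetical stand-in for a registered Data Collector."""
    name: str
    labels: frozenset


def matching_engines(engines, job_labels):
    """Return the engines that have all of the labels specified for the job."""
    required = frozenset(job_labels)
    return [e for e in engines if required <= e.labels]


def engines_to_start(running, engines, job_labels):
    """Return names of matching engines that do not yet run a pipeline instance.

    Models a job whose number of pipeline instances is set to -1.
    """
    return [e.name for e in matching_engines(engines, job_labels)
            if e.name not in running]


job_labels = ["prod", "west"]

# Three Data Collectors have all of the job labels when the job starts.
engines = [Engine(f"sdc-{i}", frozenset({"prod", "west"})) for i in range(1, 4)]
# One Data Collector is missing a label, so it gets no pipeline instance.
engines.append(Engine("sdc-dev", frozenset({"prod"})))

running = set(engines_to_start(set(), engines, job_labels))

# A fourth matching Data Collector registers while the job is active.
engines.append(Engine("sdc-4", frozenset({"prod", "west"})))
newly_started = engines_to_start(running, engines, job_labels)
running.update(newly_started)
```

In this sketch, `running` holds three engine names after the job starts, and `newly_started` contains only the newly registered `sdc-4`.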

When the number of pipeline instances is set to any other value, you must synchronize the active job to start additional pipeline instances on newly available Data Collectors or SDC Edges.

Control Hub can automatically scale out pipeline processing up to a maximum of 50 pipeline instances for an active job. If needed, you can change the default maximum value by modifying the jobs.max.instances.autoscale.limit property in the $DPM_CONF/jobrunner-app.properties configuration file.
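As a minimal sketch, the entry in $DPM_CONF/jobrunner-app.properties might look like the following. The value 100 is an illustrative assumption, not a recommendation; choose a limit that suits your deployment.

```properties
# Maximum number of pipeline instances that Control Hub starts
# when automatically scaling out an active job (default: 50).
# The value below is an example only.
jobs.max.instances.autoscale.limit=100
```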

Note: If you configure a job to automatically scale out pipeline processing, you cannot also configure the job for pipeline failover. Setting the number of pipeline instances to -1 runs an instance on each available Data Collector, which leaves no available Data Collector in reserve to act as a backup for pipeline failover.