Resource Thresholds

A deployment defines the maximum thresholds for the following engine resources:
  • CPU load
  • Memory
  • Number of running pipelines

All engine instances belonging to the deployment inherit the same resource threshold values. For advanced use cases, some organizations can override the resource thresholds for individual engine instances.

Control Hub monitors these resources for all engines. When starting, balancing, or synchronizing jobs for Data Collector pipelines, Control Hub starts new pipeline instances only on Data Collector engines that have not exceeded any resource thresholds. Similarly, when starting jobs for Transformer or Transformer for Snowflake pipelines, Control Hub starts new pipeline instances only on Transformer or deployed Transformer for Snowflake engines that have not exceeded any resource thresholds.

When multiple matching engines have not exceeded their resource thresholds, Control Hub prioritizes the engine that is currently running the fewest pipelines. For example, you start a job and two matching Data Collector engines have the maximum CPU load set to 100%. Data Collector A is currently using 70% of the CPU and running 1 pipeline. Data Collector B is currently using 50% of the CPU and running 3 pipelines. Although Data Collector B has the lower CPU usage, Control Hub starts a pipeline instance for the job on Data Collector A because it is running fewer pipelines.
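
The selection rule above can be sketched in Python. This is a minimal illustration, not Control Hub's implementation: the `Engine` class, its field names, and the `pick_engine` function are hypothetical, only the CPU threshold is modeled to keep the sketch short, and tie-breaking between engines with equal pipeline counts is not described in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Engine:
    # Hypothetical snapshot of one engine; field names are illustrative only.
    name: str
    cpu_pct: float          # current CPU usage, as a percentage
    max_cpu_pct: float      # Max CPU Load (%) threshold
    running_pipelines: int  # pipelines currently running


def pick_engine(engines: list[Engine]) -> Optional[Engine]:
    """Return the eligible engine running the fewest pipelines, or None."""
    # An engine stays eligible while its usage is below the threshold.
    eligible = [e for e in engines if e.cpu_pct < e.max_cpu_pct]
    if not eligible:
        return None  # every engine is at or over its threshold: the job waits
    return min(eligible, key=lambda e: e.running_pipelines)


a = Engine("Data Collector A", cpu_pct=70, max_cpu_pct=100, running_pipelines=1)
b = Engine("Data Collector B", cpu_pct=50, max_cpu_pct=100, running_pipelines=3)
chosen = pick_engine([a, b])
print(chosen.name)  # Data Collector A
```

The sketch filters on thresholds first and only then compares pipeline counts, which is why the lower CPU usage of Data Collector B does not affect the outcome.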

When all matching engines have exceeded their resource thresholds, Control Hub randomly places jobs in a queue, giving the jobs a red active status. The job details display the following warning message:
JOBRUNNER_72 - Insufficient <engine type> resources to run job. All matching <engine type> [<URLs>] have reached their maximum CPU usage limits.

When a matching engine no longer exceeds its resource thresholds, Control Hub randomly assigns a job from the queue to that engine, changing the job status to green active and running a pipeline instance on that engine.
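
As a rough sketch of the queue behavior described above, the following Python snippet picks one queued job at random when an engine becomes available. The job names and the `assign_from_queue` function are hypothetical, and the seeded random generator is used only to make the sketch repeatable; how Control Hub implements the random choice is not described in this section.

```python
import random


def assign_from_queue(queue: list[str], rng: random.Random) -> str:
    """Remove and return one randomly chosen job from a non-empty queue."""
    index = rng.randrange(len(queue))
    return queue.pop(index)


queued_jobs = ["job-1", "job-2", "job-3"]  # hypothetical jobs with red active status
rng = random.Random(0)                     # seeded only for repeatability
started = assign_from_queue(queued_jobs, rng)
print(started, "is now green active;", len(queued_jobs), "jobs remain queued")
```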

CPU Load

Control Hub calculates the maximum CPU load threshold based on the available CPU on the host machine.

For example, you set the Max CPU Load threshold to 80% when you configure the deployment. You deploy an engine to an on-premises machine with 12 CPU cores. When the engine consumes less than 80% of 12 CPU cores, or 9.6 CPU cores, Control Hub starts new pipeline instances on the engine. When the engine consumes 9.6 or more CPU cores, Control Hub does not start new pipeline instances on the engine.

An engine can exceed the configured resource threshold as it runs pipelines. For example, with the above configuration, when the engine consumes 9 CPU cores, Control Hub starts a new pipeline instance on the engine so that the engine consumes a total of 11 CPU cores, which exceeds the 9.6-core threshold. All running pipeline instances continue, but Control Hub does not start any new pipeline instances on the engine.
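
The arithmetic in this example can be checked with a short Python sketch. The function names are hypothetical and the check only mirrors the rule as stated above; it is not how Control Hub measures CPU usage.

```python
def cpu_core_limit(total_cores: float, max_cpu_load_pct: float) -> float:
    """Convert a Max CPU Load percentage into a number of CPU cores."""
    return total_cores * max_cpu_load_pct / 100


def can_start_pipeline(cores_in_use: float, limit: float) -> bool:
    """New pipeline instances start only while usage is below the limit."""
    return cores_in_use < limit


limit = cpu_core_limit(total_cores=12, max_cpu_load_pct=80)
print(limit)                          # 9.6
print(can_start_pipeline(9, limit))   # True: 9 cores is below the limit
print(can_start_pipeline(11, limit))  # False: running pipelines continue, no new ones start
```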

Memory

Control Hub calculates the maximum memory threshold based on the configured Java heap size for the engine.

You configure both the Java heap size and the memory resource threshold in the following locations in the Configure Engine step when you configure the deployment:

  • To configure the Java heap size, click Click here to configure next to Advanced Configuration. Then, click Java Configuration. You can configure a percentage of available memory on the host machine or an absolute number as the Java heap size, as described in the engine documentation.
  • To configure the maximum memory threshold, define a percentage in the Max Memory (%) property.

For example, you configure the deployment properties as follows:
  • Maximum Java heap size = 50%
  • Max Memory threshold = 80%

You deploy an engine as a tarball installation to an Amazon EC2 instance with 32 GiB of memory. Because the Java heap size is set to 50%, the engine can use a maximum of 16 GiB of memory. When the engine consumes less than 80% of 16 GiB, or 12.8 GiB of memory, Control Hub starts new pipeline instances on the engine. When the engine consumes 12.8 GiB or more of memory, Control Hub does not start new pipeline instances on the engine.

An engine can exceed the configured resource threshold as it runs pipelines. For example, with the above configurations, when the engine consumes 12 GiB of memory, Control Hub starts a new pipeline instance on the engine so that the engine consumes a total of 14 GiB of memory, which exceeds the 12.8 GiB threshold. All running pipeline instances continue, but Control Hub does not start any new pipeline instances on the engine.
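
The memory threshold is derived in two steps: the heap size is a share of host memory, and the threshold is a share of the heap. The following Python sketch reproduces the numbers from the example; the function name and parameter names are hypothetical.

```python
def memory_limit_gib(host_memory_gib: float,
                     heap_pct: float,
                     max_memory_pct: float) -> float:
    """Return the memory threshold in GiB for a percentage-based heap size."""
    heap_gib = host_memory_gib * heap_pct / 100  # maximum Java heap size
    return heap_gib * max_memory_pct / 100       # share of the heap, not of the host


limit = memory_limit_gib(host_memory_gib=32, heap_pct=50, max_memory_pct=80)
print(limit)       # 12.8
print(12 < limit)  # True: a new pipeline instance can start at 12 GiB
print(14 < limit)  # False: at 14 GiB no new pipeline instances start
```

Note that the threshold applies to the configured heap, so 80% here corresponds to 40% of the host's 32 GiB.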

Number of Running Pipelines

Control Hub monitors the number of running pipeline instances on each engine.

For example, you set the Max Running Pipeline Count threshold to 10. When the engine is running 9 or fewer pipelines, Control Hub starts new pipeline instances on the engine. When the engine is running 10 pipeline instances, Control Hub does not start new pipeline instances on the engine.

Note: If you start multiple jobs at exactly the same time using the scheduler or the Control Hub REST API, the number of pipelines running on an engine can exceed the configured resource threshold. If exceeding the resource threshold is not acceptable, you can enable an organization property that synchronizes the start of multiple jobs.
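
One plausible way to picture the note above is a check-then-start race, sketched below in Python. This is an assumption made for illustration only: this section does not describe Control Hub's internal mechanism, and both functions are hypothetical models rather than product behavior.

```python
MAX_RUNNING = 10  # Max Running Pipeline Count from the example above


def start_unsynchronized(running: int, requests: int) -> int:
    """Model: every simultaneous request checks the same stale count."""
    snapshot = running  # value seen by all requests before any of them starts
    for _ in range(requests):
        if snapshot < MAX_RUNNING:
            running += 1
    return running


def start_synchronized(running: int, requests: int) -> int:
    """Model: requests are handled one at a time against the live count."""
    for _ in range(requests):
        if running < MAX_RUNNING:
            running += 1
    return running


print(start_unsynchronized(9, 3))  # 12: the threshold of 10 is exceeded
print(start_synchronized(9, 3))    # 10: starts stop once the threshold is reached
```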

Editing Resource Thresholds for a Deployment

You can edit the maximum resource thresholds for a deployment.

Note: When a new engine instance launches for the deployment, it may take a few minutes for Control Hub to apply the modified values to the new engine. In the meantime, the engine uses the default values.
  1. In the Control Hub Navigation panel, click Set Up > Deployments.
  2. In the Actions column of the deployment, click the More icon and then click Edit.
  3. In the Edit Deployment dialog box, expand the Configure Engine section.
  4. By default, the Java heap size for the engine is set to 50% of the available memory. In most cases, the default percentage value is sufficient. You can optionally modify the heap size as needed.
    1. Click Click here to configure next to Advanced Configuration.
    2. Click Java Configuration.
    3. Select the JVM memory strategy to use, and then configure the minimum and maximum Java heap size.
      For details about configuring the Java heap size, see the engine documentation.
    4. Click Save.
  5. Modify the following threshold values:
    • Max CPU Load (%): Maximum percentage of CPU on the host machine that an engine instance can use. When an engine equals or exceeds this threshold, Control Hub does not start new pipeline instances on the engine. Default is 80.
    • Max Memory (%): Maximum percentage of the configured Java heap size that an engine instance can use. When an engine equals or exceeds this threshold, Control Hub does not start new pipeline instances on the engine. Default is 100.
    • Max Running Pipeline Count: Maximum number of pipelines that can be running on each engine instance. When an engine equals this threshold, Control Hub does not start new pipeline instances on the engine. Default is 1,000,000.

  6. Click Save.

Overriding Resource Thresholds

You define resource thresholds for a deployment, and then all engine instances belonging to that deployment inherit the same resource threshold values.

For advanced use cases, some organizations can override the resource thresholds for individual engine instances. However, when an engine restarts, overridden values are lost and the engine inherits the resource thresholds set for the deployment.

  1. In the Navigation panel, click Set Up > Engines.
  2. Click an engine type tab.
  3. Locate the engine that you want to configure.
  4. In the Actions column, click the More icon and then click Edit.
  5. Select Override Resource Thresholds.
  6. Override the following threshold values:
    • Max CPU Load (%): Maximum percentage of CPU on the host machine that an engine instance can use. When an engine equals or exceeds this threshold, Control Hub does not start new pipeline instances on the engine.
    • Max Memory (%): Maximum percentage of the configured Java heap size that an engine instance can use. When an engine equals or exceeds this threshold, Control Hub does not start new pipeline instances on the engine.
    • Max Running Pipeline Count: Maximum number of pipelines that can be running on each engine instance. When an engine equals this threshold, Control Hub does not start new pipeline instances on the engine.

  7. Click Save.