Creating a Job for a Pipeline

A pipeline is the design of the dataflow. A job is the execution of the dataflow.

A job defines the pipeline to run and the Data Collectors or Transformers that run the pipeline. When you create a job, you specify the published pipeline version to run and select labels for the job. Labels indicate which group of engines should run the pipeline.

When you start a job that contains a Data Collector pipeline, Control Hub runs a remote pipeline instance on Data Collectors with matching labels. Similarly, when you start a job that contains a Transformer pipeline, Control Hub runs a remote pipeline instance on Transformers with matching labels.

When you create a job that includes a pipeline with runtime parameters, you can designate the job as a job template. A job template lets you run multiple job instances with different runtime parameter values from a single job definition.

For more information about jobs, see Jobs Overview.
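The label matching described above, together with the fewest-pipelines placement noted under the Number of Instances property below, can be sketched as follows. This is an illustrative model only, not Control Hub's actual implementation; the `Engine` class, `pick_engine` function, and label values are hypothetical names invented for the sketch:

```python
from dataclasses import dataclass

@dataclass
class Engine:
    """Hypothetical model of a registered execution engine
    (Data Collector or Transformer)."""
    name: str
    labels: set[str]          # labels are case sensitive
    running_pipelines: int = 0

def available_engines(engines, job_labels):
    """An engine is available for a job only if it is assigned
    ALL of the labels specified for the job."""
    return [e for e in engines if set(job_labels) <= e.labels]

def pick_engine(engines, job_labels):
    """Among available engines, prefer the one running the fewest pipelines."""
    candidates = available_engines(engines, job_labels)
    if not candidates:
        return None  # no engine matches all labels; the job cannot start
    return min(candidates, key=lambda e: e.running_pipelines)

engines = [
    Engine("sdc-1", {"prod", "us-west"}, running_pipelines=3),
    Engine("sdc-2", {"prod", "us-west"}, running_pipelines=1),
    Engine("sdc-3", {"dev"}, running_pipelines=0),
]
print(pick_engine(engines, ["prod", "us-west"]).name)  # sdc-2: matches all labels, fewest pipelines
```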

In the Pipelines view, you can create a job for a single pipeline or for multiple pipelines at the same time.

  1. In the Navigation panel, click Pipeline Repository > Pipelines.
  2. To create a job for a single pipeline, hover over the pipeline that you want to create a job for, and then click the Create Job icon next to the pipeline.
    To create jobs for multiple pipelines, select the pipelines in the list, and then click the Create Job icon at the top of the pipeline list.
  3. On the Add Job window, configure the following properties:
    Job Name: The name of the job.

    Description: An optional description of the job.

    Pipeline: The published pipeline that you want to run.

    Pipeline Commit / Tag: The pipeline commit or pipeline tag assigned to the published pipeline version that you want to run. You can create a job for any published pipeline version. By default, Control Hub displays the latest published pipeline version.

    Execution Engine Labels: One or more labels that determine the group of execution engines that run the pipeline. Labels are case sensitive.

    Job Tags: Tags that identify similar jobs or job templates. Use job tags to easily search and filter jobs and job templates. Enter nested tags using the following format: <tag1>/<tag2>/<tag3>

    Enable Job Template: Enables the job to work as a job template. A job template lets you run multiple job instances with different runtime parameter values from a single job definition. Enable this property only for jobs that include pipelines that use runtime parameters.
    Note: You can enable a job to work as a job template only during job creation. You cannot enable an existing job to work as a job template.

    Statistics Refresh Interval (ms): The number of milliseconds to wait before automatically refreshing statistics while you monitor the job. The minimum and default value is 60,000 milliseconds.

    Enable Time Series Analysis: Enables Control Hub to store time series data, which you can analyze when you monitor the job. When time series analysis is disabled, you can still view the total record count and throughput for a job, but you cannot view the data over a period of time. For example, you cannot view the record count for the last five minutes or for the last hour.

    Number of Instances: The number of pipeline instances to run for the job. Increase the value only when the pipeline is designed to scale out. Default is 1, which runs one pipeline instance on the available Data Collector that is running the fewest pipelines. An available Data Collector is an engine that is assigned all of the labels specified for the job. Available for Data Collector jobs only.

    Enable Failover: Enables Control Hub to restart a pipeline on another available engine when the original engine shuts down unexpectedly. Default is disabled. Control Hub manages pipeline failover differently based on the engine type; see the failover documentation for your engine type for details.

    Failover Retries per Data Collector: The maximum number of pipeline failover retries to attempt on each available Data Collector. When a Data Collector reaches the maximum number of failover retries, Control Hub does not attempt to restart additional failed pipelines for the job on that Data Collector. Use -1 to retry indefinitely. Available for Data Collector jobs when failover is enabled.

    Global Failover Retries: The maximum number of pipeline failover retries to attempt across all available engines. When the maximum number of global failover retries is reached, Control Hub stops the job. Use -1 to retry indefinitely. Control Hub manages failover retries differently based on the engine type; see the failover documentation for your engine type for details. Available when failover is enabled.

    Require Job Error Acknowledgement: Requires users to acknowledge an inactive error status due to connectivity issues before the job can be restarted. Clear this property for a scheduled job so that the job can be restarted automatically without user intervention.
    Important: Clear the property with caution, as doing so might hide errors that the job has encountered.

    Pipeline Force Stop Timeout (ms): The number of milliseconds to wait before forcing remote pipeline instances to stop. In some situations when you stop a job, a remote pipeline instance can remain in a Stopping state. For example, if a scripting processor in the pipeline includes code with a timed wait or an infinite loop, the pipeline remains in a Stopping state until it is force stopped. Default is 120,000 milliseconds, or 2 minutes.

    Runtime Parameters: Runtime parameter values to start the pipeline instances with. These values override the default parameter values defined for the pipeline. Click Get Default Parameters to display the parameters and default values as defined in the pipeline, and then override the default values. You can configure parameter values using simple or bulk edit mode. In bulk edit mode, configure parameter values in JSON format.

  4. If creating a job for a single pipeline, click Add Another to add another job for the same pipeline. Configure the properties for the additional job.
    Click Save when you have finished configuring all jobs.
  5. If creating jobs for multiple pipelines, click Next to configure properties for the next job. When you finish configuring a job for each selected pipeline, click Create.

Control Hub displays the job or job template in the Jobs view.
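In bulk edit mode, the Runtime Parameters property described in step 3 expects a single JSON object mapping parameter names to override values. The parameter names below are hypothetical examples invented for this sketch; use the names defined in your pipeline, which Get Default Parameters lists for you:

```python
import json

# Hypothetical parameter names; the real names come from the
# runtime parameter definitions in your published pipeline.
runtime_parameters = {
    "ORIGIN_DIR": "/data/incoming",
    "BATCH_SIZE": 1000,
    "ERROR_DIR": "/data/errors",
}

# Bulk edit mode expects the overrides as one JSON object like this:
print(json.dumps(runtime_parameters, indent=2))
```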