Troubleshooting

Use the following tips for help with job management:

I get the following aggregated metrics warning when I monitor a job:
Aggregated metrics for the job are not available as individual pipeline metrics are discarded.
The job includes a pipeline that is configured to discard statistics. To monitor aggregated statistics and metrics, you must configure the pipeline to write statistics to Control Hub or to another system.
Edit the pipeline to configure it to write statistics, then publish the updated pipeline. You'll need to stop the job, and then edit the job so that it includes the latest published version of the pipeline. Then, you can start the job again.
For more information, see Pipeline Statistics.
I get the following permission enforcement warning when I monitor a job:
Permission enforcement is enabled for this organization, but the following Data Collectors do not support it: <Data Collector URL>
Permission enforcement is enabled for your organization, but the job is active on a Data Collector version earlier than 2.4.0.0 that does not support pipeline permissions.
Data Collector version 2.4.0.0 introduces pipeline sharing and permissions. An earlier version of Data Collector can run pipelines for the job, but does not support permissions. To enforce pipeline permissions, upgrade to Data Collector version 2.4.0.0 or later.
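To see which Data Collectors fall below the 2.4.0.0 threshold described above, you could compare version strings with a small script. The following is a minimal sketch, not part of any Control Hub or Data Collector API: the function names are hypothetical, and it assumes that versions are plain dotted integers such as 2.3.0.1.

```python
# Minimum Data Collector version that supports pipeline permissions (from the text above).
MIN_PERMISSIONS_VERSION = (2, 4, 0, 0)


def parse_version(version: str) -> tuple:
    """Convert a dotted version string into a 4-part tuple of integers.

    Assumption: versions contain only dot-separated integers. Shorter
    versions are padded with zeros so that "2.4" compares equal to "2.4.0.0".
    """
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts)


def supports_permissions(version: str) -> bool:
    """Return True if the given version is 2.4.0.0 or later."""
    return parse_version(version) >= MIN_PERMISSIONS_VERSION
```

For example, `supports_permissions("2.3.0.1")` evaluates to `False`, which corresponds to a Data Collector that would be listed in the warning.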
A job fails to start with the following error message:
Number of instances: <number> are more than number of matching Data Collectors: <number>
The Number of Instances property for the job is set to a value greater than the number of available Data Collectors. When you start a job, Control Hub can run only a single pipeline instance on each available Data Collector.
Edit the job to set the number of instances to a value less than or equal to the number of available Data Collectors, and then start the job again. For more information, see Number of Pipeline Instances.
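The condition behind this error can be expressed as a simple comparison. The following sketch is illustrative only and does not reflect how Control Hub is implemented: the function name and parameters are hypothetical, and it assumes that you already know how many Data Collectors match the job.

```python
def can_start_job(number_of_instances: int, matching_data_collectors: int) -> bool:
    """Return True if the job can start.

    Each available Data Collector runs at most one pipeline instance for
    the job, so the job fails to start only when the requested number of
    instances is greater than the number of matching Data Collectors.
    """
    return number_of_instances <= matching_data_collectors
```

With three matching Data Collectors, `can_start_job(3, 3)` is `True` and `can_start_job(4, 3)` is `False`.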
A job has a red active status and displays the following message in the job details:
JOBRUNNER_72 - Insufficient <execution engine type> resources to run job. All matching <execution engine type> [<URLs>] have reached their maximum CPU usage limits.
Because all matching execution engines have exceeded their resource thresholds, Control Hub places the job in a queue. When a matching execution engine no longer exceeds its resource thresholds, Control Hub randomly assigns a job from the queue to that engine, changes the job status to green active, and runs a pipeline instance on that engine.
If this issue persists, consider increasing the maximum resource thresholds for the execution engines, or assigning the job label to additional execution engines.
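The queueing behavior described above can be pictured with a toy model. The following sketch is a simplified illustration, not Control Hub code: the `Engine` class, its fields, and both functions are hypothetical, and it assumes that an engine counts as saturated once its CPU usage reaches the configured maximum, as the wording of the JOBRUNNER_72 message suggests.

```python
import random
from dataclasses import dataclass


@dataclass
class Engine:
    url: str
    cpu_usage: float  # current CPU usage in percent (hypothetical field)
    max_cpu: float    # maximum CPU usage threshold in percent (hypothetical field)

    def has_capacity(self) -> bool:
        # Assumption: reaching the limit counts as having no capacity.
        return self.cpu_usage < self.max_cpu


def enqueue_if_saturated(job: str, engines: list, queue: list) -> bool:
    """Queue the job when every matching engine is at its limit.

    Returns True if the job was queued, False if at least one engine
    still has capacity.
    """
    if any(engine.has_capacity() for engine in engines):
        return False
    queue.append(job)
    return True


def assign_from_queue(engine: Engine, queue: list):
    """Pick a random queued job for an engine that has capacity again.

    Returns the chosen job, or None if the engine is still saturated or
    the queue is empty.
    """
    if not queue or not engine.has_capacity():
        return None
    job = random.choice(queue)
    queue.remove(job)
    return job
```

In this model, raising `max_cpu` or adding more matching engines both make `enqueue_if_saturated` less likely to queue a job, which mirrors the two remedies suggested above.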
One of the remote pipeline instances run from my job has stopped. How can I view the logs for that pipeline?
When you monitor an active job, you can view the Data Collector or Transformer log to review log messages for pipelines running on the execution engine. For details, see Logs.
A Data Collector engine has suddenly lost its connection to Control Hub. What happens to the currently active jobs on that engine?
When a Data Collector engine running a pipeline loses its connection to Control Hub, the engine continues to remotely run the pipeline, temporarily saving the pipeline status and last-saved offset in data files on the engine machine.

If the engine reconnects to Control Hub before the maximum engine heartbeat interval expires, the engine reports the saved pipeline data to Control Hub. No data loss or data duplication occurs.

If the maximum engine heartbeat interval expires before the engine reconnects to Control Hub, Control Hub considers the engine unresponsive. Control Hub handles jobs on unresponsive engines based on whether pipeline failover is enabled for the job. For details, see Jobs and Unresponsive Data Collector Engines.
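The two cases above reduce to a comparison between how long the engine has been disconnected and the maximum engine heartbeat interval. The following sketch only classifies the situation. It is not Control Hub code, the names are hypothetical, and it deliberately does not model what Control Hub does with the job in each unresponsive case, because that depends on the failover behavior described in Jobs and Unresponsive Data Collector Engines.

```python
from enum import Enum


class Outcome(Enum):
    # Engine reconnected in time and reports its saved status and offset.
    REPORT_SAVED_DATA = "report saved pipeline data"
    # Engine is considered unresponsive; the job has pipeline failover enabled.
    UNRESPONSIVE_WITH_FAILOVER = "unresponsive, failover enabled"
    # Engine is considered unresponsive; the job has pipeline failover disabled.
    UNRESPONSIVE_WITHOUT_FAILOVER = "unresponsive, failover disabled"


def classify_disconnect(
    seconds_disconnected: float,
    max_heartbeat_interval: float,
    failover_enabled: bool,
) -> Outcome:
    """Classify a lost connection using the rules described above.

    Assumption: both durations are expressed in seconds, and reconnecting
    exactly at the interval counts as expired.
    """
    if seconds_disconnected < max_heartbeat_interval:
        return Outcome.REPORT_SAVED_DATA
    if failover_enabled:
        return Outcome.UNRESPONSIVE_WITH_FAILOVER
    return Outcome.UNRESPONSIVE_WITHOUT_FAILOVER
```

For example, with a hypothetical interval of 300 seconds, `classify_disconnect(30, 300, False)` returns `Outcome.REPORT_SAVED_DATA`.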