Scheduled Task Types
- Job
- Report
Jobs
Use the scheduler to start, stop, or upgrade jobs on a regular basis. Each scheduled task completes a single type of action. To complete more than one type of action for a job, create a separate scheduled task for each.
When you define a scheduled task for a job, you specify one of the following actions that the task completes:
- Start
- Starts the job at the specified frequency.
- Stop
- Stops the job at the specified frequency.
- Upgrade
- Upgrades the job to use the latest pipeline version at the specified frequency.
If a scheduled task triggers a job start when the job is already active, a job stop when the job is already inactive, or a job upgrade when no later pipeline version exists, then no action is performed. The scheduled task simply logs that it was not able to start, stop, or upgrade the job. The task then continues running until the next scheduled time when it triggers another job start, stop, or upgrade.
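The no-action cases described above can be sketched as a small state check. This is only an illustrative model, not the scheduler's actual implementation: the `Job` class, its fields, the `Action` enum, and `run_scheduled_action` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    START = "start"
    STOP = "stop"
    UPGRADE = "upgrade"


@dataclass
class Job:
    # Hypothetical in-memory stand-in for a job; not a real scheduler type.
    active: bool
    pipeline_version: int
    latest_pipeline_version: int


def run_scheduled_action(job: Job, action: Action) -> str:
    """Apply one action to the job, or report that nothing was done."""
    if action is Action.START:
        if job.active:
            return "skipped: job is already active"
        job.active = True
        return "started"
    if action is Action.STOP:
        if not job.active:
            return "skipped: job is already inactive"
        job.active = False
        return "stopped"
    # Remaining case: Action.UPGRADE
    if job.pipeline_version >= job.latest_pipeline_version:
        return "skipped: no later pipeline version exists"
    job.pipeline_version = job.latest_pipeline_version
    return "upgraded"
```

In this model, a skipped trigger leaves the job unchanged and returns a message, which mirrors how the scheduled task logs the skipped action and waits for its next scheduled time.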
Batch and Streaming Jobs
- Batch job
- A batch job includes a pipeline that processes all available data, and then stops. Create schedules for batch jobs to start the jobs on a regular basis.
- Streaming job
- A streaming job includes a pipeline that maintains a connection to the origin system and processes data as it becomes available. Because you expect data to arrive continuously, the pipeline runs until you manually stop it. In most cases, there's no need to schedule streaming jobs.
However, you might want to schedule a streaming job so that the job initially starts at some point in the future. For example, you want the job to start next Saturday at midnight, when no DevOps engineer is available to start it manually.
In this case, you would schedule the start of the streaming job as a one-time event.
Or, you might want to schedule a streaming job to start and stop on a regular basis. For example, you want to run a streaming job continuously every day of the week except for Sunday. You create one scheduled task that starts the job every Monday at 12:00 AM. Then, you create another scheduled task that stops the same job every Sunday at 12:00 AM. The next Monday at 12:00 AM, the scheduler starts the job again so that the pipeline can continue running.
In this case, you would schedule both the start and the stop of the streaming job as recurring events.
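To check the effect of the paired Monday start and Sunday stop tasks described above, the following sketch computes whether the job would be running at a given moment. It assumes both tasks fire exactly at 12:00 AM and always succeed. The constants and function names are hypothetical and are not part of the scheduler.

```python
from datetime import datetime, timedelta

# datetime.weekday() numbering: Monday is 0, Sunday is 6.
START_WEEKDAY = 0  # the start task fires every Monday at 12:00 AM
STOP_WEEKDAY = 6   # the stop task fires every Sunday at 12:00 AM


def last_trigger(moment: datetime, weekday: int) -> datetime:
    """Return the most recent midnight on `weekday` at or before `moment`."""
    days_back = (moment.weekday() - weekday) % 7
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days_back)


def job_should_be_running(moment: datetime) -> bool:
    """True if the latest task to fire before `moment` was the start task."""
    last_start = last_trigger(moment, START_WEEKDAY)
    last_stop = last_trigger(moment, STOP_WEEKDAY)
    return last_start > last_stop
```

For example, `job_should_be_running(datetime(2024, 1, 3, 10, 0))` (a Wednesday) is true, while the same call for `datetime(2024, 1, 7, 5, 0)` (a Sunday) is false.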
Reports
Use the scheduler to schedule the generation of data delivery reports on a regular basis.
Data delivery reports present data ingestion metrics for a given job or topology. For example, you can schedule a daily report that shows the number of records that a job processed the previous day.
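As a rough illustration of the metric in that daily report example, the sketch below sums record counts for the day before the report date. The input shape, a list of `(timestamp, record_count)` pairs, and the function name are assumptions made for this example. They do not reflect how the scheduler or the reports store metrics.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, Tuple


def records_processed_previous_day(
    events: Iterable[Tuple[datetime, int]], report_date: date
) -> int:
    """Sum the record counts whose timestamps fall on the day before report_date."""
    previous_day = report_date - timedelta(days=1)
    return sum(count for stamp, count in events if stamp.date() == previous_day)


# Hypothetical sample data: two batches on January 1, one on January 2.
sample_events = [
    (datetime(2024, 1, 1, 3, 0), 25),
    (datetime(2024, 1, 1, 23, 0), 100),
    (datetime(2024, 1, 2, 1, 0), 50),
]
daily_total = records_processed_previous_day(sample_events, date(2024, 1, 2))
```

Here `daily_total` covers only the two January 1 batches.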