Draft Runs
Section Contents
Draft Runs#
A draft run is the execution of a draft pipeline. Use draft runs for development purposes only. The SDK enables you to interact with draft runs on the Platform including starting a draft run, stopping a draft run, deleting a draft run, retrieving a draft run. For more details, refer to the StreamSets Platform Documentation for draft runs.
Note
A draft run can only run for a pipeline in draft mode.
Starting a Draft Run#
In the Platform UI, to start a draft run you have to visit the pipeline’s canvas:
In the SDK to create a new streamsets.sdk.sch_models.DraftRuns
object or restart an existing streamsets.sdk.sch_models.DraftRuns
and add it to the Platform, first retrieve the streamsets.sdk.sch_models.Pipeline
object that you wish to create the draft run for.
Next pass that object to the streamsets.sdk.ControlHub.start_draft_run()
:
pipeline = sch.pipelines.get(name='Test Pipeline')
sch.start_draft_run(pipeline=pipeline)
Retrieving an Existing Draft Run#
In the Platform UI, a list of draft runs can be found and further filtered:
The equivalent steps in the SDK to retrieve, you can reference the streamsets.sdk.ControlHub.draft_runs
attribute of your streamsets.sdk.ControlHub
instance.
You can also further filter and refine the Draft Runs returned based on attributes including job_name
, job_status
, and job_id
:
# Returns a list of all draft run instances in 'INACTIVE' status
sch.draft_runs.get_all(job_status='INACTIVE')
# Retrieve a particular draft run's based on name
draft_run = sch.draft_runs.get(job_name='job name')
# Returns a single draft run instance that matches the supplied ID
sch.draft_runs.get(id='acd9dea6-ffcd-4ae0-86e2-bef38c000304:9e1e3faa-ca28-4c05-9edb-3b18aaba4604')
Stopping a Draft Run#
When a draft run has served it’s purpose but may need to be restarted later, you can stop the draft run. In the Platform UI, stopping a draft run can be done as shown:
The SDK equivalent of stopping a draft run requires first retrieving the streamsets.sdk.sch_models.DraftRun
object, then passing that object to the streamsets.sdk.ControlHub.stop_draft_run()
method:
# Get draft run instance to stop
draft_run = sch.draft_runs.get(job_name='job name')
sch.stop_draft_run(draft_run)
Deleting a Draft Run#
When a draft run has served it’s purpose, and is no longer necessary you can delete the draft run. Deleting a draft run in the UI can be done similarly to stopping:
To delete a draft run in the SDK first retrieve the streamsets.sdk.sch_models.DraftRun
object next pass that object to the streamsets.sdk.ControlHub.delete_draft_run()
method:
# Get draft run instance to delete
draft_run = sch.draft_runs.get(job_name='job name')
sch.delete_draft_run(draft_run)
Snapshots#
Once a snapshot is captured, it’s possible to inspect what data was captured.
In the UI snapshots can be accessed from the pipeline canvas once the draft run is started as shown below:
Retrieving an Existing Snapshot#
To access snapshots for a draft run, reference the streamsets.sdk.sch_models.DraftRun.snapshots
attribute of a streamsets.sdk.sch_models.DraftRun
instance.
This will return a streamsets.sdk.utils.SeekableList
of streamsets.sdk.sch_models.Snapshot
objects that are in chronological order:
draft_run = sch.draftRuns.get(job_name='job name')
# Get all snapshots belonging to that draft run
snapshots = draft_run.snapshots
# Get a specific snapshot
snapshots = draft_run.snapshot.get(name="Snapshot3")
Capturing a Snapshot#
To generate a snapshot for an existing draft run reference the streamsets.sdk.sch_models.DraftRun.capture_snapshot()
, you could use the following steps:
draft_run = sch.draftRuns.get(job_name='job name')
draft_run.capture_snapshot()
Deleting a Snapshot#
If you’ve successfully generated a snapshot for a pipeline and no longer need to retain it, it can be deleted using the streamsets.sdk.sch_models.DraftRun.remove_snapshot()
method.
The method expects to receive a streamsets.sdk.sch_models.DraftRun
instance as an argument:
draft_run = sch.draftRuns.get(job_name='job name')
snapshot = draft_run.snapshots.get(name='snapshot_name')
draft_run.remove_snapshot(snapshot)
Logs#
Retrieving logs#
In the UI, you get view logs belonging to a draft run from the pipeline canvas:
In the SDK to view logs belonging to a draft run reference the streamsets.sdk.sch_models.DraftRun.get_logs()
, in return you get a list
of dict
instances, each a log line:
draft_run = sch.draftRuns.get(job_name='job name')
logs = draft_run.get_logs()
Bringing It All Together#
The complete scripts from this section can be found below.
# Start a draft run
pipeline = sch.pipelines.get(name='Test Pipeline')
sch.start_draft_run(pipeline=pipeline)
# Retrieve a particular draft run's based on name
draft_run = sch.draft_runs.get(job_name='Draft Run for Test Pipeline')
# Take Snapshot of draft run
draft_run.capture_snapshot()
# Get Snapshot
snapshot = draft_run.snapshots.get(name='Snapshot1')
# Remove snapshot
draft_run.remove_snapshot(snapshot)
# Get snapshot logs
draft_run.get_logs()
# Stop draft run instance
sch.stop_draft_run(draft_run)
# Delete Draft Run
sch.delete_draft_run(draft_run)