Advanced Search

Advanced Query Language

To perform an advanced search, you write a search query using the StreamSets advanced query language (SAQL).

A search query includes one or more conditions joined by an AND or OR logical operator. Each condition clause uses the following syntax:

<property><operator><value>

For example, the following query finds objects modified at any time on January 1, 2022:

modified_on=="2022-01-01"

Note: The advanced query language is based on RSQL.

The object that you are searching for, such as pipelines or jobs, determines the available search properties. The search property type determines the available operators.

The advanced query language allows you to specify criteria that cannot be defined in basic search, such as defining complex conditions or changing the order of precedence for multiple conditions. When needed, you can change to basic mode to perform a basic search.

You can save an advanced search for later use. You can also configure Control Hub to persist the last configured advanced search when you return to the view.

Multiple Condition Clauses

You can include multiple condition clauses joined by one of the following logical operators:
  • Logical AND - Use and or a semicolon (;).
  • Logical OR - Use or or a comma (,).
For example, both of the following queries use the logical AND operator to find pipelines with a name starting with lo and with a version of 1:
  • name==lo* and version==1
  • name==lo*;version==1

When you include multiple condition clauses, the AND operator takes precedence over OR. Control Hub first evaluates all clauses connected by an AND operator, and then evaluates each clause connected by an OR operator. You can use parentheses in the query to define a different order of precedence.

For example, let's say that you define the following query:

version==1 or version==2 and committed_on>="2022-06-01"

Control Hub evaluates the conditions in the following order:
  1. Finds pipelines with a version of 2 and that were committed on or after June 1, 2022.
  2. Finds pipelines with a version of 1 with no date restriction.

If you use the following parentheses in the query to define a different order of precedence:

(version==1 or version==2) and committed_on>="2022-06-01"

Control Hub evaluates the conditions in the following order:
  1. Finds pipelines committed with a version of 1 or 2.
  2. Within those search results, finds pipelines committed on or after June 1, 2022.

Search Properties

Each searchable object type includes a fixed set of properties that you create search conditions for. All search property names are case sensitive.

The property type determines the available operators and the required format for the search value.
Tip: To ensure that you use the correct property name, define your search in basic mode and then change to advanced mode.

Pipeline and Fragment Properties

A pipeline or fragment search finds the most recent version of each pipeline or fragment that matches all search conditions, which might not be the latest version of the pipeline or fragment. To find only the latest version, define a condition using the latest_version property. To find a specific version, define a condition using the version property.

You can use the following properties to define search conditions for pipelines and fragments:
Property Name Type Description
commit_message String Commit message for the pipeline or fragment.
commit_id String Pipeline or fragment commit ID.
committed_by String User who committed the pipeline or fragment.

Specify the user email address. For example: rita@mycompany.com

committed_on Datetime Date and time the pipeline or fragment was committed.
description String Pipeline or fragment description.
draft Boolean Indicates whether the pipeline or fragment is a draft version.
engine_type Enumerator Type of engine for the pipeline or fragment. You can specify the following fixed values:
  • COLLECTOR - For Data Collector pipelines or fragments.
  • TRANSFORMER - For Transformer pipelines or fragments.
  • SNOWPARK - For Transformer for Snowflake pipelines or fragments.
engine_version String Version of the authoring engine used to last modify the pipeline or fragment. For example, 5.0.0 or 5.1.0.
execution_mode Enumerator Execution mode of the pipeline or fragment, based on the selected engine type. You can specify the following fixed values:
  • STANDALONE - For Data Collector pipelines or fragments.
  • BATCH or STREAMING - For Transformer pipelines or fragments.
  • SNOWPARK - For Transformer for Snowflake pipelines or fragments.
label String Pipeline or fragment label.

Returns only published pipelines or fragments matching the specified label, not draft pipelines or fragments.

To search for nested labels, such as Test/ADLSGen2 or Test/Elasticsearch, see Nested Labels and Tags.

latest_version Boolean Indicates whether the pipeline or fragment version is the latest version.

When set to true, the condition restricts searches to the latest version of each pipeline or fragment.

When set to false, the condition restricts searches to the most recent version that is not the latest version.

For more information about searching for a pipeline or fragment version, see Pipeline and Fragment Versions.

library_name String Name of a stage library used in the pipeline or fragment.

For example, you might search for all fragments that use stages included in the Apache Kafka 3.1.0 stage library.

modified_by String User who last modified the pipeline or fragment.

Specify the user email address. For example: rita@mycompany.com

modified_on Datetime Date and time the pipeline or fragment was last modified.
name String Name of the pipeline or fragment.
pipeline_id String Pipeline or fragment ID.
stage_name String Name of a stage used in the pipeline or fragment.

For example, you might search for all pipelines that include an Amazon S3 stage.

version String Pipeline or fragment version. You can specify a draft version, such as 1-DRAFT. Or a published version, such as 3.
Note: On some Control Hub pages, the Version column includes a v prefix so that the version number displays as v1 or v3. On other pages, the Version column does not include the prefix so that the version number displays as 1 or 3. In basic searches, you can include or exclude the prefix. For example, you can search for either v1 or 1. In advanced searches, you must exclude the prefix. For example, you must search for 1.

For more information about searching for a pipeline or fragment version, see Pipeline and Fragment Versions.

Job Template Properties

You can use the following properties to define search conditions for job templates:
Property Name Type Description
archived Boolean Indicates whether the job template is archived.
created_by String User who created the job template.

Specify the user email address. For example: rita@mycompany.com

created_on Datetime Date and time the job template was created.
description String Job template description.
engine_label String Engine labels assigned to the job template.
engine_type Enumerator Type of engine for the job template. You can specify the following fixed values:
  • COLLECTOR - For Data Collector engines.
  • TRANSFORMER - For Transformer engines.
  • SNOWPARK - For Transformer for Snowflake engines.
failover Boolean Indicates whether pipeline failover is enabled for the job template.
id String Job template ID.
instances Integer Number of pipeline instances configured for the job template.

Available only for Data Collector job templates.

modified_by String User who last modified the job template.

Specify the user email address. For example: rita@mycompany.com

modified_on Datetime Date and time the job template was last modified.
name String Name of the job template.
pipeline_name String Name of the pipeline included in the job template.
pipeline_version String Version of the pipeline included in the job template. You can specify a draft version, such as v1-DRAFT. Or a published version, such as v3.
Note: On some Control Hub pages, the Version column includes a v prefix so that the version number displays as v1 or v3. On other pages, the Version column does not include the prefix so that the version number displays as 1 or 3. In basic searches, you can include or exclude the prefix. For example, you can search for v1 or 1. In advanced searches, you must include the prefix. For example, you must search for v1.
tag String Job template tag.

To search for nested tags, such as Test/ADLSGen2 or Test/Elasticsearch, see Nested Labels and Tags.

Job Instance Properties

You can use the following properties to define search conditions for job instances:
Property Name Type Description
attached_to_template Boolean Indicates whether the job instance is attached to a parent job template.

When set to true, the condition finds attached job instances created from job templates.

When set to false, the condition finds job instances created from pipelines and detached job instances created from job templates.

For more information, see Attached and Detached Job Instances.

color Enumerator Color of the job instance status. You can specify the following fixed values:
  • GREEN
  • GRAY
  • RED
created_by String User who created the job instance.

Specify the user email address. For example: rita@mycompany.com

created_on Datetime Date and time the job instance was created.
description String Job instance description.
engine_label String Engine labels assigned to the job instance.
engine_type Enumerator Type of engine for the job instance. You can specify the following fixed values:
  • COLLECTOR - For Data Collector engines.
  • TRANSFORMER - For Transformer engines.
  • SNOWPARK - For Transformer for Snowflake engines.
failover Boolean Indicates whether pipeline failover is enabled for the job instance.
id String Job instance ID.
instances Integer Number of pipeline instances configured for the job instance.

Available only for Data Collector job instances.

modified_by String User who last modified the job instance.

Specify the user email address. For example: rita@mycompany.com

modified_on Datetime Date and time the job instance was last modified.
name String Name of the job instance.
pipeline_name String Name of the pipeline included in the job instance.
pipeline_status Enumerator Status of the pipeline included in the job instance. For the list of fixed values that you can specify, see Pipeline Status.
Note: For Data Collector jobs that run multiple pipeline instances, all pipeline instances must have the same status. For example, let's say that you search for jobs that include a pipeline with the RUNNING status. The search does not find a job that has one pipeline instance with the RUNNING status and another with the RUN_ERROR status.
pipeline_version String Version of the pipeline included in the job instance. You can specify a draft version, such as v1-DRAFT. Or a published version, such as v3.
Note: On some Control Hub pages, the Version column includes a v prefix so that the version number displays as v1 or v3. On other pages, the Version column does not include the prefix so that the version number displays as 1 or 3. In basic searches, you can include or exclude the prefix. For example, you can search for v1 or 1. In advanced searches, you must include the prefix. For example, you must search for v1.
status Enumerator Job instance status. You can specify the following fixed values:
  • ACTIVATING
  • ACTIVATION_ERROR
  • ACTIVE
  • DEACTIVATING
  • INACTIVE
  • INACTIVE_ERROR
Note: The ACTIVATION_ERROR status is not used at this time.
tag String Job instance tag.

To search for nested tags, such as Test/ADLSGen2 or Test/Elasticsearch, see Nested Labels and Tags.

Draft Run Properties

You can use the following properties to define search conditions for draft runs:
Property Name Type Description
color Enumerator Status color of the job automatically created for the draft run. You can specify the following fixed values:
  • GREEN
  • GRAY
  • RED
created_by String User who created the draft run.

Specify the user email address. For example: rita@mycompany.com

created_on Datetime Date and time the draft run was created.
engine_type Enumerator Type of engine for the draft run. You can specify the following fixed values:
  • COLLECTOR - For Data Collector engines.
  • TRANSFORMER - For Transformer engines.
  • SNOWPARK - For Transformer for Snowflake engines.
id String ID of the job automatically created for the draft run.
modified_by String User who last modified the draft run.

Specify the user email address. For example: rita@mycompany.com

modified_on Datetime Date and time the draft run was last modified.
name String Name of the draft run.
pipeline_name String Name of the pipeline included in the draft run.
pipeline_status Enumerator Status of the pipeline included in the draft run. For the list of fixed values that you can specify, see Pipeline Status.
pipeline_version String Version of the pipeline included in the draft run. You can specify a draft version, such as v1-DRAFT.
Note: On some Control Hub pages, the Version column includes a v prefix so that the version number displays as v1-DRAFT or v3-DRAFT. On other pages, the Version column does not include the prefix so that the version number displays as 1-DRAFT or 3-DRAFT. In basic searches, you can include or exclude the prefix. For example, you can search for v1-DRAFT or 1-DRAFT. In advanced searches, you must include the prefix. For example, you must search for v1-DRAFT.
status Enumerator Status of the job automatically created for the draft run. You can specify the following fixed values:
  • ACTIVATING
  • ACTIVATION_ERROR
  • ACTIVE
  • DEACTIVATING
  • INACTIVE
  • INACTIVE_ERROR
Note: The ACTIVATION_ERROR status is not used at this time.

Sequence Properties

You can use the following properties to define search conditions for sequences:
Property Name Type Description
created_by_email String User who created the sequence.

Specify the user email address. For example: rita@mycompany.com

created_on Datetime Date and time the sequence was created.
description String Sequence description.
id String Sequence ID.
job_name String Name of a job included in the sequence.
last_modified_by_email String User who last modified the sequence.

Specify the user email address. For example: rita@mycompany.com

last_run_time Datetime Date and time the sequence was last run.
modified_on Datetime Date and time the sequence was last modified.
name String Name of the sequence.
next_run_time Datetime Date and time that the sequence is scheduled to next run.
status Enumerator Status of the sequence. You can specify the following fixed values:
  • ACTIVE
  • DISABLED
  • ERROR
  • INACTIVE
Note: Changes to the created_by_email, job_name, and last_modified_by_email properties are not immediately reflected in sequence searches. If expected sequences do not appear in your search results, wait a few minutes and then try again.

Operators

The operators that you can use in a condition depend on the property type. For example, you can use one set of operators to search for string properties, and another set to search for datetime properties.

The values that you define in a condition must be compatible with the property type and must be entered using the required format.
Tip: To ensure that you use the required format for values, define your search in basic mode and then change to advanced mode.

Boolean Operators

You can use the following operators for boolean properties:
  • Equal to: ==
  • Not equal to: !=
  • Empty or not empty: =null=

All operators for boolean properties accept only a true or false value. For example, the following condition finds all jobs enabled for pipeline failover:

failover==true

Datetime Operators

You can use the following operators for datetime properties:
  • Equal to: ==
  • Not equal to: !=
  • Less than: =lt= or <
  • Less than or equal to: =le= or <=
  • Greater than: =gt= or >
  • Greater than or equal to: =ge= or >=
  • Empty or not empty: =null=
Use the following guidelines to define datetime values:
Datetime formats
Use one of the following formats:
  • YYYY
  • YYYY-MM
  • YYYY-MM-DD
  • YYYY-MM-DD hh
  • YYYY-MM-DD hh:mm
  • YYYY-MM-DD hh:mm:ss
Enclose datetime values in quotation marks

Enclose datetime values in single or double quotation marks.

For example, the following condition finds all fragments modified at any time on January 1, 2022:

modified_on=="2022-01-01"

The following condition finds all fragments modified on or after 10 am on January 1, 2022:

modified_on>="2022-01-01 10:00:00"

Empty or not empty operator

The =null= operator accepts only a true or false value.

For example, the following condition finds all fragments that do not have a committed date:

committed_on=null=true

Enumerator Operators

An enumerator property allows you to select from a set of fixed values, such as the engine_type property.

You can use the following operators for enumerator properties:
  • Equal to: ==
  • Not equal to: !=
  • In: =in=
  • Not in: =out=
  • Empty or not empty: =null=
Use the following guidelines to define enumerator values:
Case sensitive

Enumerator values are case sensitive.

For example, to find pipelines that use the streaming execution mode, you must enter the value in all uppercase, as follows:

execution_mode==STREAMING

Multiple enumerator values

Use the =in= and =out= operators to search for multiple enumerator values. Enclose multiple enumerator values in parentheses and separate the values with commas.

For example, the following condition finds all Data Collector or Transformer pipelines:

engine_type=in=(COLLECTOR,TRANSFORMER)

Empty or not empty operator

The =null= operator accepts only a true or false value.

For example, the following condition finds all pipelines that have a defined execution mode:

execution_mode=null=false

Integer Operators

You can use the following operators for integer properties:
  • Equal to: ==
  • Not equal to: !=
  • Less than: =lt= or <
  • Less than or equal to: =le= or <=
  • Greater than: =gt= or >
  • Greater than or equal to: =ge= or >=
  • In: =in=
  • Not in: =out=
  • Empty or not empty: =null=

Use the following guidelines to define integer values:

Multiple integer values
Use the =in= and =out= operators to search for multiple integer values. Enclose multiple integer values in parentheses and separate the values with commas.
For example, the following condition finds all jobs configured to run 3 or 4 pipeline instances:

instances=in=(3,4)

Empty or not empty operator
The =null= operator accepts only a true or false value.
For example, the following condition finds all jobs that have a defined Number of Instances property:

instances=null=false

String Operators

You can use the following operators for string properties:
  • Equal to: ==
  • Not equal to: !=
  • Less than: =lt= or <
  • Less than or equal to: =le= or <=
  • Greater than: =gt= or >
  • Greater than or equal to: =ge= or >=
  • In: =in=
  • Not in: =out=
  • Empty or not empty: =null=
Use the following guidelines to define string values:
Case insensitive
String values are case insensitive.
For example, all of the following conditions find the Tutorial pipeline:
  • name==tutorial
  • name==TUTORIAL
  • name==tUtoRIaL
Enclose strings with spaces in quotation marks
Enclose a string that includes spaces in single or double quotation marks. For example:

name=="HDFS to ADLS2"

Asterisk wildcard
Include the asterisk wildcard (*) in string values to perform a multiple character wildcard search.

For example, the following condition finds the LogsToKafka, LogsToKinesis, and LogsToS3 pipelines:

name==logs*

You can also include the asterisk wildcard in the middle of a string value. For example, the following search condition finds the LogsToKafka and LogsToKinesis pipelines:

name==logs*k*

Escape special characters with a backslash
Escape special characters with a backslash (\). Special characters include single quotation marks ('), double quotation marks ("), and asterisks (*).
For example, both of the following conditions find the "Western Store's Data" pipeline:
  • name=='\"Western Store\'s Data\"'
  • name=="\"Western Store's Data\""
Multiple string values
Use the =in= and =out= operators to search for multiple string values. Enclose multiple string values in parentheses and separate the values with commas.
For example, the following condition finds all pipelines using the 5.0.0 or 5.1.0 engine version:

engine_version=in=(5.0.0, 5.1.0)

Less than and greater than operators
Use the less than (< and <=) and greater than (> and >=) operators to search for string values that are in a lower or higher alphabetic order, where a is the lowest and z is the highest.
For example, the following condition finds the LogsToS3 and Tutorial pipelines, but not the HDFS to ADLS2 pipeline:
name>LogsToKinesis
Empty or not empty operator
The =null= operator accepts only a true or false value.
For example, the following condition finds all fragments that have an empty description:

description=null=true

For details about searching for string values for job instance or job template tags or for pipeline or fragment labels and versions, see the following topics:

Query Examples

You might adapt the following advanced search queries for your use:
failover==true and status==INACTIVE
Finds jobs enabled for pipeline failover and that have an INACTIVE status.
name==DL0* and committed_on>"2021-06-05"
Finds fragments with a name that begins with DL0 and that were committed after June 5, 2021.
(modified_by==melinda@mycompany.com or modified_by==louis@mycompany.com) and modified_on>="2022-01-01"
Finds pipelines last modified by Melinda or Louis and last modified at any time on or after January 1, 2022.
engine_version==5.1.0 and (name==*snowflake* or name==oracle*)
Finds pipelines that use a 5.1.0 engine version and with a name that contains the string snowflake or a name that begins with the string oracle.
(engine_version==5.1.0 or description=="For testing*") and name==*kafka*
Finds pipelines that use a 5.1.0 engine version or that have a description beginning with the string For testing and that have a name that contains the string kafka.
engine_type=in=(COLLECTOR,TRANSFORMER) and engine_label==tutorial
Finds Data Collector and Transformer jobs that are assigned the tutorial engine label.