Control Hub API

Supported pipeline types:
  • Data Collector

The Control Hub API processor sends requests to the Control Hub REST API and writes data from the response to a specified output field.

The Control Hub API processor is an orchestration stage that you use in orchestration pipelines. Orchestration stages perform tasks, such as scheduling and starting pipelines and Control Hub jobs, that you can use to create an orchestrated workflow across the StreamSets platform. For example, an orchestration pipeline can use the Cron Scheduler origin to generate a record every weekday at 5 PM and trigger the Control Hub API processor, which calls a Control Hub REST API to stop a Control Hub job that runs during business hours.

When you configure the Control Hub API processor, you specify the URL for the API, the field to write the response to, any headers to include with the request, and the method to use. The method can be a standard HTTP method or an expression that determines the method for each record. For some methods, you can specify data to include with the request.

You can configure the processor to log request and response information.

You can also configure the timeout and maximum number of parallel requests. You can optionally use an HTTP proxy and configure SSL/TLS properties.

Finding the URL to a Control Hub REST API

Each Control Hub REST API has a unique URL. The URL contains the Control Hub URL and the path to the API.

You can find the URL in Control Hub.

  1. From the Help menu, click RESTful API.
    The page lists the APIs by category.
  2. In the list of APIs, expand the API to view API details, including the URL.
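As a rough illustration of how the two parts of the URL combine, the following Python sketch joins a Control Hub URL with an API path. The base URL https://controlhub.example.com and the helper name build_api_url are hypothetical. The path is the pipeline commit API that this page uses as an example, with its {commitId} placeholder left unfilled.

```python
def build_api_url(control_hub_url: str, api_path: str) -> str:
    """Join the Control Hub URL and an API path with exactly one slash."""
    return control_hub_url.rstrip("/") + "/" + api_path.lstrip("/")


# Hypothetical base URL; substitute the URL of your own Control Hub.
url = build_api_url(
    "https://controlhub.example.com/",
    "/pipelinestore/rest/v1/pipelineCommit/{commitId}",
)
print(url)
```

The resulting string is the kind of value that you enter for the URL property of the processor.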

HTTP Method

The Control Hub API processor supports the following HTTP methods:
  • GET
  • PUT
  • POST
  • DELETE
  • Expression - An expression that evaluates to one of the other methods.

Specify a method valid for the API call.

You can use the Expression method to set the HTTP method based on the data in a field. For example, the same API call, /pipelinestore/rest/v1/pipelineCommit/{commitId}, can retrieve, save, or delete a pipeline commit, depending on the specified method. Therefore, you might configure the Control Hub API processor to use that API call with an expression that returns the GET, POST, or DELETE method based on data in a field.
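The idea behind the Expression method can be sketched in Python. This is only a model of the selection logic, not the processor's expression language. The field name operation and its values retrieve, save, and delete are hypothetical, chosen to mirror the pipeline commit example above.

```python
# Hypothetical mapping from a value in a record field to an HTTP method.
METHOD_BY_OPERATION = {
    "retrieve": "GET",
    "save": "POST",
    "delete": "DELETE",
}


def method_for(record: dict, field: str = "operation") -> str:
    """Return the HTTP method selected by the value of the given field."""
    operation = record[field]
    try:
        return METHOD_BY_OPERATION[operation]
    except KeyError:
        raise ValueError(f"No HTTP method defined for operation {operation!r}")


print(method_for({"operation": "retrieve"}))  # prints GET
print(method_for({"operation": "delete"}))    # prints DELETE
```

Because the method is resolved per record, a single stage can serve all three operations against the same API path.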

Parallel Requests

The Control Hub API processor can send multiple requests simultaneously.

To preserve record order, the processor waits until all the requests from an entire batch are completed before sending requests from the next batch.

You can specify the maximum number of parallel requests. Default is 1. Increasing the number of parallel requests can improve performance but increases the load on the Control Hub server. Network latency can also significantly impact the performance of this processor.
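The batch behavior described above can be approximated with a thread pool. The following Python sketch is a simplified model under stated assumptions, not the processor's implementation: process_batches, send_request, and fake_send are hypothetical names, and fake_send stands in for a real HTTP call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def process_batches(
    batches: Iterable[List[dict]],
    send_request: Callable[[dict], dict],
    max_parallel_requests: int = 1,
) -> List[dict]:
    """Send each batch with bounded parallelism, one batch at a time."""
    results: List[dict] = []
    with ThreadPoolExecutor(max_workers=max_parallel_requests) as pool:
        for batch in batches:
            # map() yields results in input order, and list() blocks until
            # every request in this batch has completed. Only then does the
            # loop move on to the next batch, which preserves record order.
            results.extend(list(pool.map(send_request, batch)))
    return results


def fake_send(record: dict) -> dict:
    """Stand-in for an HTTP call; echoes the record with a response."""
    return {**record, "response": f"ok-{record['id']}"}


batches = [[{"id": 1}, {"id": 2}], [{"id": 3}]]
out = process_batches(batches, fake_send, max_parallel_requests=2)
print([r["id"] for r in out])  # prints [1, 2, 3]
```

With max_parallel_requests left at 1, the requests run strictly one after another, which corresponds to the default described above.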

Logging Request and Response Data

The Control Hub API processor can log request and response data to the Data Collector log.

When enabling logging, you configure the following properties:

Verbosity
The type of data to include in logged messages:
  • Headers_Only - Includes request and response headers.
  • Payload_Text - Includes request and response headers as well as any text payloads.
  • Payload_Any - Includes request and response headers and the payload, regardless of type.
Log Level
The level of messages to include in the Data Collector log. When you select a level, higher level messages are also logged. That is, if you select the Warning log level, then Severe and Warning messages are written to the Data Collector log.
Note: The configured log level for Data Collector can limit the level of detail that is logged. For example, if you set the log level to Finest to log detailed trace information, but Data Collector is configured for ERROR, then the stage only writes Severe level messages.
The following table describes the stage log levels and the corresponding Data Collector log levels needed to enable the logging:
Stage Log Level | Data Collector | Description
Severe | ERROR | Only messages indicating serious failures.
Warning | WARN | Messages warning of potential problems.
Info | INFO | Informational messages.
Fine | DEBUG | Basic tracing information.
Finer | DEBUG | Detailed tracing information.
Finest | TRACE | Highly detailed tracing information.

The name of this stage logger is com.streamsets.http.RequestLogger.

Max Entity Size
The maximum size of message data to write to the log. Use to limit the volume of data written to the Data Collector log for any single message.
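The interaction between the stage log level and the Data Collector log level can be worked through in a short Python sketch. The mapping comes from the log level table above. The ordering of Data Collector levels from ERROR (least detail) to TRACE (most detail) is an assumption based on common logging conventions, and the function name written_levels is hypothetical.

```python
# Stage log levels, from least to most detailed.
STAGE_LEVELS = ["Severe", "Warning", "Info", "Fine", "Finer", "Finest"]

# Data Collector log level needed for each stage log level (from the table).
REQUIRED_DC_LEVEL = {
    "Severe": "ERROR",
    "Warning": "WARN",
    "Info": "INFO",
    "Fine": "DEBUG",
    "Finer": "DEBUG",
    "Finest": "TRACE",
}

# Assumed ordering of Data Collector levels, least to most detailed.
DC_LEVELS = ["ERROR", "WARN", "INFO", "DEBUG", "TRACE"]


def written_levels(stage_level: str, dc_level: str) -> list:
    """Return the stage message levels that reach the Data Collector log."""
    # Selecting a stage level also selects every less detailed level.
    selected = STAGE_LEVELS[: STAGE_LEVELS.index(stage_level) + 1]
    dc_rank = DC_LEVELS.index(dc_level)
    # Keep only the levels whose required Data Collector level is enabled.
    return [
        level
        for level in selected
        if DC_LEVELS.index(REQUIRED_DC_LEVEL[level]) <= dc_rank
    ]


print(written_levels("Finest", "ERROR"))  # prints ['Severe']
print(written_levels("Warning", "INFO"))  # prints ['Severe', 'Warning']
```

The first call reproduces the case in the note: with the stage set to Finest and Data Collector set to ERROR, only Severe messages are written.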

Generated Record

The Control Hub API processor updates the orchestration record that it receives with the response from the Control Hub REST API. The response is placed in the output field specified in the stage properties. The fields included in the response depend on the API that the Control Hub API processor calls.

For example, in a pipeline preview, the orchestration record can show a currentStatus field that was added by the Control Hub API processor. The field contains the response from a currentStatus API call. The call requests the status of the job that completed earlier in the orchestration pipeline and returns additional information about the job.
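The update described above can be modeled roughly as follows. In this minimal Python sketch, the record is a plain dictionary rather than an actual orchestration record, the jobId field and all values are hypothetical placeholders, and overwriting an existing output field is an assumption made for illustration.

```python
def write_response(record: dict, output_field: str, response: dict) -> dict:
    """Return a copy of the record with the response in the output field.

    Assumption: an existing field with the same name is overwritten.
    """
    updated = dict(record)
    updated[output_field] = response
    return updated


# Placeholder values only; real fields depend on the API that is called.
record = {"jobId": "<job ID>"}
response = {"status": "<status returned by the API>"}

updated = write_response(record, "currentStatus", response)
print(sorted(updated))  # prints ['currentStatus', 'jobId']
```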

Configuring a Control Hub API Processor

Configure a Control Hub API processor to call a Control Hub REST API. The Control Hub API processor is an orchestration stage that you use in orchestration pipelines.

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property | Description
    Name | Stage name.
    Description | Optional description.
    Required Fields | Fields that must include data for the record to be passed into the stage.
    Tip: You might include fields that the stage uses.

    Records that do not include all required fields are processed based on the error handling configured for the pipeline.

    Preconditions | Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.

    Records that do not meet all preconditions are processed based on the error handling configured for the stage.

    On Record Error | Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the pipeline for error handling.
    • Stop Pipeline - Stops the pipeline. Not valid for cluster pipelines.
  2. On the HTTP tab, configure the following properties:
    HTTP Property | Description
    Control Hub URL | URL to the Control Hub REST API. Enter the Control Hub URL and path to a specific API.
    Output Field | Field to store the response. You can use a new or existing field.
    Headers | Headers to include in the request. In simple or bulk edit mode, click the Add icon to add additional headers.
    HTTP Method | HTTP method for the request. Select one of the standard methods or select Expression to enter an expression.
    HTTP Method Expression | Expression that evaluates to a standard HTTP method.

    Available when HTTP Method is set to Expression.

    Request Data | Data to include with the request. Available for the PUT, POST, DELETE, and Expression methods.
    Use Proxy | Enables using an HTTP proxy to connect to Control Hub.
    Connection Timeout | Maximum number of milliseconds to wait for a connection.
    Read Timeout | Maximum number of milliseconds to wait for data.
    Maximum Parallel Requests | Maximum number of requests to send to the server at one time.
    Maximum Request Time | Maximum number of seconds to wait for a request to complete.
  3. On the Credentials tab, configure the following properties:
    Credentials Property | Description
    Authentication Type | Method for specifying authentication details:
    • User & Password (SCH 3.x only) - Use when Data Collector is registered with Control Hub cloud or Control Hub on-premises version 3.x.
    • API User Credentials - Use when Data Collector is deployed from Control Hub in the StreamSets platform.
    User Name | Control Hub user that calls the API. Enter in the following format:
    <ID>@<organization ID>

    Available when Authentication Type is set to User & Password.

    Auth ID | ID of a Control Hub API credential for someone authorized to call the API.

    For information on creating a Control Hub API credential, see the Control Hub documentation.

    Available when Authentication Type is set to API User Credentials.

    Password | Password for the specified Control Hub user or the token for the specified Control Hub API credential.
    Tip: To secure sensitive information such as user names and passwords, you can use runtime resources or credential stores.
  4. To use an HTTP proxy, on the Proxy tab, configure the following properties:
    Proxy Property | Description
    Proxy URI | Proxy URI.
    Username | Proxy user name.
    Password | Proxy password.
    Tip: To secure sensitive information such as user names and passwords, you can use runtime resources or credential stores.
  5. To use SSL/TLS, click the TLS tab and configure the following properties:
    TLS Property | Description
    Use TLS | Enables the use of TLS.
    Use Remote Keystore | Enables loading the contents of the keystore from a remote credential store or from values entered in the stage properties. For more information, see Remote Keystore and Truststore.
    Private Key | Private key used in the remote keystore. Enter a credential function that returns the key or enter the contents of the key.
    Certificate Chain | Each PEM certificate used in the remote keystore. Enter a credential function that returns the certificate or enter the contents of the certificate.

    Using simple or bulk edit mode, click the Add icon to add additional certificates.

    Keystore File | Path to the local keystore file. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

    ${runtime:resourcesDirPath()}/keystore.jks

    By default, no keystore is used.

    Keystore Type | Type of keystore to use. Use one of the following types:
    • Java Keystore File (JKS)
    • PKCS #12 (p12 file)

    Default is Java Keystore File (JKS).

    Keystore Password | Password to the keystore file. A password is optional, but recommended.
    Tip: To secure sensitive information such as passwords, you can use runtime resources or credential stores.
    Keystore Key Algorithm | Algorithm to manage the keystore.

    Default is SunX509.

    Use Remote Truststore | Enables loading the contents of the truststore from a remote credential store or from values entered in the stage properties. For more information, see Remote Keystore and Truststore.
    Trusted Certificates | Each PEM certificate used in the remote truststore. Enter a credential function that returns the certificate or enter the contents of the certificate.

    Using simple or bulk edit mode, click the Add icon to add additional certificates.

    Truststore File | Path to the local truststore file. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

    ${runtime:resourcesDirPath()}/truststore.jks

    By default, no truststore is used.

    Truststore Type | Type of truststore to use. Use one of the following types:
    • Java Keystore File (JKS)
    • PKCS #12 (p12 file)

    Default is Java Keystore File (JKS).

    Truststore Password | Password to the truststore file. A password is optional, but recommended.
    Tip: To secure sensitive information such as passwords, you can use runtime resources or credential stores.
    Truststore Trust Algorithm | Algorithm to manage the truststore.

    Default is SunX509.

    Use Default Protocols | Uses the default TLSv1.2 transport layer security (TLS) protocol. To use a different protocol, clear this option.
    Transport Protocols | TLS protocols to use. To use a protocol other than the default TLSv1.2, click the Add icon and enter the protocol name. You can use simple or bulk edit mode to add protocols.
    Note: Older protocols are not as secure as TLSv1.2.
    Use Default Cipher Suites | Uses a default cipher suite for the SSL/TLS handshake. To use a different cipher suite, clear this option.
    Cipher Suites | Cipher suites to use. To use a cipher suite that is not a part of the default set, click the Add icon and enter the name of the cipher suite. You can use simple or bulk edit mode to add cipher suites.

    Enter the Java Secure Socket Extension (JSSE) name for the additional cipher suites that you want to use.

  6. On the Logging tab, configure the following properties to log request and response data:
    Logging Property | Description
    Enable Request Logging | Enables logging request and response data.
    Log Level | The level of detail to be logged. Choose one of the available options.
    The following list is in order of lowest to highest level of logging. When you select a level, messages generated by the levels above the selected level are also written to the log:
    • Severe - Only messages indicating serious failures.
    • Warning - Messages warning of potential problems.
    • Info - Informational messages.
    • Fine - Basic tracing information.
    • Finer - Detailed tracing information.
    • Finest - Highly detailed tracing information.
    Note: The log level configured for Data Collector can limit the level of messages that the stage writes. Verify that the Data Collector log level supports the level that you want to use.
    Verbosity | The type of data to include in logged messages:
    • Headers_Only - Includes request and response headers.
    • Payload_Text - Includes request and response headers as well as any text payloads.
    • Payload_Any - Includes request and response headers and the payload, regardless of type.
    Max Entity Size | The maximum size of message data to write to the log. Use to limit the volume of data written to the Data Collector log for any single message.
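To relate the HTTP and TLS properties in the steps above to an ordinary HTTP client, the following Python sketch assembles an equivalent request with the standard library. It is an analogy under stated assumptions, not a description of how the processor works internally. The URL, header, request data, and timeout value are hypothetical, and the request is built but never sent.

```python
import json
import ssl
import urllib.request


def build_request(url, method, headers, request_data=None):
    """Build (but do not send) a request from property-like values."""
    body = None
    if request_data is not None:
        # Assumption: the request data is sent as a JSON document.
        body = json.dumps(request_data).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method=method)


def tls_context():
    """Create a TLS context that refuses protocols older than TLSv1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# Hypothetical values standing in for the stage properties.
request = build_request(
    url="https://controlhub.example.com/pipelinestore/rest/v1/pipelineCommit/example-commit-id",
    method="POST",
    headers={"Content-Type": "application/json"},
    request_data={"key": "value"},
)
connection_timeout_ms = 5000
timeout_seconds = connection_timeout_ms / 1000  # urllib expects seconds

print(request.get_method(), request.full_url)
# To send the request, you could call:
# urllib.request.urlopen(request, timeout=timeout_seconds, context=tls_context())
```

The conversion from milliseconds to seconds reflects the units in the HTTP properties: the connection and read timeouts are given in milliseconds, while the maximum request time is given in seconds.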