Splunk

The Splunk destination writes data to Splunk using the Splunk HTTP Event Collector (HEC). For information about supported versions, see Supported Systems and Versions.

The destination sends HTTP POST requests to the HEC endpoint using the JSON data format. The destination generates one HTTP request for each batch, sending multiple records at a time. Each record must contain the event data and optionally the event metadata in the format required by Splunk.
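For illustration, the following sketch approximates the kind of batched request the destination sends: several records serialized as JSON event objects and posted in a single request body. The host, port, token, and the /services/collector/event path are placeholder assumptions based on standard Splunk HEC usage, not values taken from this destination's implementation.

# Approximate sketch of a batched HEC request; placeholder endpoint and token values.
import json
import requests

HEC_URL = "https://server.example.com:8088/services/collector/event"  # assumed standard HEC event endpoint
HEC_TOKEN = "<your HEC token>"

# One batch: each record becomes one JSON event object, concatenated into a single request body.
records = [
    {"host": "myserver.example.com", "event": {"message": "first record"}},
    {"host": "myserver.example.com", "event": {"message": "second record"}},
]
body = "".join(json.dumps(r) for r in records)

response = requests.post(
    HEC_URL,
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
    data=body,
)
response.raise_for_status()  # Splunk replies with {"text": "Success", "code": 0} on success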

Before you configure the destination, you must complete several prerequisites including enabling HEC in Splunk and creating an HEC authentication token.

When you configure the Splunk destination, you supply the Splunk API endpoint and the HEC authentication token. You can configure the timeout, request transfer encoding, and authentication type. You can configure the destination to use the Gzip or Snappy compression format to write the data. You can optionally use an HTTP proxy and configure SSL/TLS properties.

You can also configure the destination to log request and response information.

Prerequisites

Before you can write to Splunk, you must complete the following prerequisites:
Enable HTTP Event Collector (HEC)
By default, HEC in Splunk is disabled. Enable HEC as described in the Splunk documentation.
Create an HTTP Event Collector (HEC) token
To send data to HEC, the Splunk destination must use a token to authenticate to the Splunk server on which HEC is running. Create the HEC token as described in the Splunk documentation.
When you configure the Splunk destination in Data Collector, you enter this token value.
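As a quick sanity check outside of Data Collector, you can confirm that HEC is enabled and that the token is valid by posting a single test event. The sketch below assumes the standard HEC event endpoint path and uses placeholder host and token values.

# Sanity check (not part of Data Collector): post one test event to confirm HEC and the token work.
import requests

HEC_URL = "https://server.example.com:8088/services/collector/event"  # placeholder endpoint
HEC_TOKEN = "<your HEC token>"

resp = requests.post(
    HEC_URL,
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
    json={"event": "HEC connectivity test"},
)
print(resp.status_code, resp.text)
# 200 with {"text":"Success","code":0} -> HEC is enabled and the token is valid
# 401 or 403                           -> check the token value and whether the token is enabled
# connection refused or timeout        -> check that HEC is enabled and the port is reachable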

Required Record Format

Splunk requires that the event data and metadata be correctly formatted in the record. If the record is formatted incorrectly, an error occurs and the destination fails to write to Splunk. When you design a pipeline with the Splunk destination, you must ensure that the record sent to the destination uses the required format.

The record must contain an /event field that contains the event data. The /event field can be a string, map, or list-map field. For more information, see Event data in the Splunk documentation.
Important: The Splunk destination does not support raw events. Events must be sent in the /event field.

The record can optionally contain event metadata fields. Splunk defines several keys that can be included in the event metadata. Any metadata keys that you do not include in the record are set to the values defined for the HEC token on the Splunk server. For a list of the keys that can be included in event metadata, see Event metadata in the Splunk documentation.

For example, the following record includes three of the keys that can be included in event metadata and an /event field using the Map data type:
{
    "time": 1437522387,
    "host": "myserver.example.com",
    "source": "myapp",
    "event": { 
        "message": "Here is my message",
        "severity": "INFO"
    }
}
The following record includes five of the keys that can be included in event metadata and an /event field using the String data type. The time value is expressed in epoch time:
{
    "time": 1426279439,
    "host": "localhost",
    "source": "datasource",
    "sourcetype": "txt",
    "index": "main",
    "event": "Here is my event" 
}
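
As an illustration only, the following sketch checks whether a record has the shape described above. It is not part of Data Collector; the metadata key list mirrors the keys documented by Splunk (time, host, source, sourcetype, index, fields).

# Illustrative check of the required record format; not part of Data Collector.
SPLUNK_METADATA_KEYS = {"time", "host", "source", "sourcetype", "index", "fields"}

def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record looks well formed."""
    problems = []
    if "event" not in record:
        problems.append("missing required /event field")
    elif not isinstance(record["event"], (str, dict, list)):
        problems.append("/event must be a string, map, or list-map value")
    for key in record:
        if key != "event" and key not in SPLUNK_METADATA_KEYS:
            problems.append("unexpected top-level field: /" + key)
    return problems

# The second example record above passes the check:
record = {
    "time": 1426279439,
    "host": "localhost",
    "source": "datasource",
    "sourcetype": "txt",
    "index": "main",
    "event": "Here is my event",
}
assert validate_record(record) == []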

Logging Request and Response Data

The Splunk destination can log request and response data to the Data Collector log.

When enabling logging, you configure the following properties:

Verbosity
The type of data to include in logged messages:
  • Headers_Only - Includes request and response headers.
  • Payload_Text - Includes request and response headers as well as any text payloads.
  • Payload_Any - Includes request and response headers and the payload, regardless of type.
Log Level
The level of messages to include in the Data Collector log. When you select a level, messages at that level and at more severe levels are also logged. For example, if you select the Warning log level, then both Severe and Warning messages are written to the Data Collector log.
Note: The log level configured for Data Collector can limit the level of detail that is logged. For example, if you set the stage log level to Finest to log detailed trace information, but Data Collector is configured for the ERROR log level, then the destination only writes Severe-level messages.
The following table describes the stage log levels and the corresponding Data Collector log levels needed to enable the logging:
Stage Log Level | Data Collector Log Level | Description
Severe | ERROR | Only messages indicating serious failures.
Warning | WARN | Messages warning of potential problems.
Info | INFO | Informational messages.
Fine | DEBUG | Basic tracing information.
Finer | DEBUG | Detailed tracing information.
Finest | TRACE | Highly detailed tracing information.

The name of this stage logger is com.streamsets.http.RequestLogger.

Max Entity Size

The maximum size of message data to write to the log. Use to limit the volume of data written to the Data Collector log for any single message.

Configuring a Splunk Destination

Configure a Splunk destination to write data to Splunk using the Splunk HTTP Event Collector (HEC).

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property Description
    Name Stage name.
    Description Optional description.
    Required Fields Fields that must include data for the record to be passed into the stage.
    Tip: You might include fields that the stage uses.

    Records that do not include all required fields are processed based on the error handling configured for the pipeline.

    Preconditions Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.

    Records that do not meet all preconditions are processed based on the error handling configured for the stage.

    On Record Error Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the pipeline for error handling.
    • Stop Pipeline - Stops the pipeline. Not valid for cluster pipelines.
  2. On the Splunk tab, configure the following properties:
    Splunk Property Description
    Splunk API Endpoint Splunk API endpoint entered in the following format:
    <protocol>://<host>:<port>
    For example:
    https://server.example.com:8088

    For more information on configuring the endpoint, see Send data to HEC in the Splunk documentation.

    Splunk Token Value of the HEC token that you created for the destination, as described in Prerequisites.
  3. On the HTTP tab, configure the following properties:
    HTTP Property Description
    Request Transfer Encoding Use one of the following encoding types:
    • Buffered - The standard transfer encoding type.
    • Chunked - Transfers data in chunks. Not supported by all servers.

    Default is Buffered.

    HTTP Compression Compression format for the messages:
    • None
    • Snappy
    • Gzip

    For an illustration of a Gzip-compressed request, see the sketch at the end of this step.
    Connect Timeout Maximum number of milliseconds to wait for a connection.
    Read Timeout Maximum number of milliseconds to wait for data.
    Authentication Type Determines the authentication type used to connect to the server:
    • None - Performs no authentication.
    • Basic - Uses basic authentication. Requires a username and password.

      Use with HTTPS to avoid passing unencrypted credentials.

    • Digest - Uses digest authentication. Requires a username and password.
    • Universal - Makes an anonymous connection, then provides authentication credentials after receiving a 401 status and a WWW-Authenticate header from the server.

      Requires a username and password associated with basic or digest authentication.

      Use only with servers that respond to this workflow.

    • OAuth - Uses OAuth 1.0 authentication. Requires OAuth credentials.
    Use Proxy Enables using an HTTP proxy to connect to the system.
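
The following sketch illustrates one common way a Gzip-compressed HEC request can be built: the JSON body is gzipped and a Content-Encoding header tells Splunk how to decompress it. The endpoint and token are placeholders, and this is an assumption about the general mechanism rather than a literal copy of what the destination does internally.

# Illustrative sketch of a Gzip-compressed HEC request; placeholder endpoint and token values.
import gzip
import json
import requests

HEC_URL = "https://server.example.com:8088/services/collector/event"  # placeholder endpoint
HEC_TOKEN = "<your HEC token>"

events = [{"event": "compressed event %d" % i} for i in range(3)]
body = "".join(json.dumps(e) for e in events).encode("utf-8")

resp = requests.post(
    HEC_URL,
    headers={
        "Authorization": f"Splunk {HEC_TOKEN}",
        "Content-Encoding": "gzip",  # declares that the request body is gzip-compressed
    },
    data=gzip.compress(body),
)
print(resp.status_code, resp.text)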
  4. To use an HTTP proxy, on the Proxy tab, configure the following properties:
    Proxy Property Description
    Proxy URI Proxy URI.
    Username Proxy user name.
    Password Proxy password.
    Tip: To secure sensitive information such as user names and passwords, you can use runtime resources or credential stores.
  5. To use SSL/TLS, on the TLS tab, configure the following properties:
    TLS Property Description
    Use TLS Enables the use of TLS.
    Use Remote Keystore Enables loading the contents of the keystore from a remote credential store or from values entered in the stage properties. For more information, see Remote Keystore and Truststore.
    Private Key Private key used in the remote keystore. Enter a credential function that returns the key or enter the contents of the key.
    Certificate Chain Each PEM certificate used in the remote keystore. Enter a credential function that returns the certificate or enter the contents of the certificate.
    Keystore File

    Path to the local keystore file. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

    ${runtime:resourcesDirPath()}/keystore.jks

    By default, no keystore is used.

    Keystore Type Type of keystore to use. Use one of the following types:
    • Java Keystore File (JKS)
    • PKCS #12 (p12 file)

    Default is Java Keystore File (JKS).

    Keystore Password

    Password to the keystore file. A password is optional, but recommended.

    Tip: To secure sensitive information such as passwords, you can use runtime resources or credential stores.
    Keystore Key Algorithm

    Algorithm to manage the keystore.

    Default is SunX509.

    Use Remote Truststore Enables loading the contents of the truststore from a remote credential store or from values entered in the stage properties. For more information, see Remote Keystore and Truststore.
    Trusted Certificates Each PEM certificate used in the remote truststore. Enter a credential function that returns the certificate or enter the contents of the certificate.

    Using simple or bulk edit mode, click the Add icon to add additional certificates.

    Truststore File

    Path to the local truststore file. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

    ${runtime:resourcesDirPath()}/truststore.jks

    By default, no truststore is used.

    Truststore Type
    Type of truststore to use. Use one of the following types:
    • Java Keystore File (JKS)
    • PKCS #12 (p12 file)

    Default is Java Keystore File (JKS).

    Truststore Password

    Password to the truststore file. A password is optional, but recommended.

    Tip: To secure sensitive information such as passwords, you can use runtime resources or credential stores.
    Truststore Trust Algorithm

    Algorithm to manage the truststore.

    Default is SunX509.

    Use Default Protocols Uses the default TLSv1.2 transport layer security (TLS) protocol. To use a different protocol, clear this option.
    Transport Protocols TLS protocols to use. To use a protocol other than the default TLSv1.2, click the Add icon and enter the protocol name. You can use simple or bulk edit mode to add protocols.
    Note: Older protocols are not as secure as TLSv1.2.
    Use Default Cipher Suites Uses a default cipher suite for the SSL/TLS handshake. To use a different cipher suite, clear this option.
    Cipher Suites Cipher suites to use. To use a cipher suite that is not a part of the default set, click the Add icon and enter the name of the cipher suite. You can use simple or bulk edit mode to add cipher suites.

    Enter the Java Secure Socket Extension (JSSE) name for the additional cipher suites that you want to use.

  6. On the Logging tab, configure the following properties to log request and response data:
    Logging Property Description
    Enable Request Logging Enables logging request and response data.
    Log Level The level of detail to log.
    The following list is ordered from least to most verbose. When you select a level, messages from the levels listed above the selected level are also written to the log:
    • Severe - Only messages indicating serious failures.
    • Warning - Messages warning of potential problems.
    • Info - Informational messages.
    • Fine - Basic tracing information.
    • Finer - Detailed tracing information.
    • Finest - Highly detailed tracing information.
    Note: The log level configured for Data Collector can limit the level of messages that the stage writes. Verify that the Data Collector log level supports the level that you want to use.
    Verbosity
    The type of data to include in logged messages:
    • Headers_Only - Includes request and response headers.
    • Payload_Text - Includes request and response headers as well as any text payloads.
    • Payload_Any - Includes request and response headers and the payload, regardless of type.
    Max Entity Size

    The maximum size of message data to write to the log. Use to limit the volume of data written to the Data Collector log for any single message.