Functions

When using a StreamSets function, you can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The following table lists all available functions. For details about each function, see the related function type:

Function Type Functions
Base64 functions
  • base64:decodeBytes(<string>)
  • base64:decodeString(<string>, <charset>)
  • base64:encodeBytes(<byte array>, <urlSafe: true | false>)
  • base64:encodeString(<string>, <urlSafe: true | false>, <charset>)
Batch functions
  • batch:attribute(<attribute name>)
  • batch:attributeOrDefault(<attribute name>, <default value>)
Credential functions
  • credential:get(<cstoreId>, <userGroup>, <name>)
  • credential:getWithOptions(<cstoreId>, <userGroup>, <name>, <storeOptions>)
File functions
  • file:fileExtension(<filepath>)
  • file:fileName(<filepath>)
  • file:parentPath(<filepath>)
  • file:pathElement(<filepath>, <integer>)
  • file:removeExtension(<filepath>)
Job functions
  • job:id()
  • job:name()
  • job:startTime()
  • job:user()
Math functions
  • math:abs(<number>)
  • math:ceil(<number>)
  • math:floor(<number>)
  • math:max(<number1>, <number2>)
  • math:min(<number1>, <number2>)
  • math:round(<number>)
Pipeline functions
  • pipeline:id()
  • pipeline:name()
  • pipeline:startTime()
  • pipeline:title()
  • pipeline:user()
  • pipeline:version()
String functions
  • str:concat(<string1>, <string2>)
  • str:contains(<string>, <subset>)
  • str:endsWith(<string>, <subset>)
  • str:escapeXML10(<string>)
  • str:escapeXML11(<string>)
  • str:indexOf(<string>, <subset>)
  • str:isNullOrEmpty(<string>)
  • str:lastIndexOf(<string>, <subset>)
  • str:length(<string>)
  • str:matches(<string>, <regEx>)
  • str:regExCapture(<string>, <regEx>, <group>)
  • str:replace(<string>, <oldChar>, <newChar>)
  • str:replaceAll(<string>, <regEx>, <newString>)
  • str:split(<string>, <separator>)
  • str:splitKV(<string>, <pairSeparator>, <keyValueSeparator>)
  • str:startsWith(<string>, <subset>)
  • str:substring(<string>, <beginIndex>, <endIndex>)
  • str:toLower(<string>)
  • str:toUpper(<string>)
  • str:trim(<string>)
  • str:truncate(<string>, <length>)
  • str:unescapeJava(<string>)
  • str:unescapeXML(<string>)
  • str:urlDecode(<URL>, <charset>)
  • str:urlEncode(<infoforURL>, <charset>)
Time functions
  • time:createDateFromStringTZ(<string>, <time zone>, <date format>)
  • time:dateTimeToMilliseconds(<Date object>)
  • time:dateTimeZoneOffset(<Date object>, <time zone>)
  • time:extractDateFromString(<string>, <format string>)
  • time:extractLongFromDate(<Date object>, <format string>)
  • time:extractStringFromDate(<Date object>, <format string>)
  • time:extractStringFromDateTZ(<Date object>, <time zone>, <format string>)
  • time:millisecondsToDateTime(<long>)
  • time:now()
  • time:timeZoneOffset(<time zone>)
  • time:trimDate(<datetime>)
  • time:trimTime(<datetime>)
Miscellaneous functions
  • runtime:availableProcessors()
  • runtime:conf(<runtime property>)
  • runtime:loadResource(<file name>, <restricted: true | false>)
  • runtime:loadResourceRaw(<file name>, <restricted: true | false>)
  • sdc:hostname()
  • sdc:id()
  • size()
  • uuid:uuid()

Base64 Functions

Use Base64 functions to encode or decode information using Base64.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following Base64 functions:

base64:decodeBytes(<string>)
Returns a decoded byte array from a Base64 encoded string.
Return type: Byte Array.
Uses the following argument:
  • string - The Base64 encoded string to decode.
For example, ${base64:decodeBytes("dGhpc2lzbXlwYXNzd29yZA==")} decodes the Base64 encoded string and returns a byte array.
base64:decodeString(<string>, <charset>)
Returns a decoded string from a Base64 encoded string using the specified character set.
Return type: String.
Uses the following arguments:
  • string - The Base64 encoded string to decode.
  • charset - The character set to use to decode the data.
For example, ${base64:decodeString("dGhpc2lzbXlwYXNzd29yZA==", "UTF-8")} decodes the Base64 encoded string using the UTF-8 character set and returns a string value.
base64:encodeBytes(<byte array>, <urlSafe: true | false>)
Returns a Base64 encoded string value of the specified byte array.
Return type: String.
Uses the following arguments:
  • byte array - The byte array to encode using Base64.
  • urlSafe - When set to true, encodes the data so that it can be safely sent in a URL.
For example, ${base64:encodeBytes(<byte array>, true)} uses Base64 to encode a specified byte array such that the encoded data is URL safe.
base64:encodeString(<string>, <urlSafe: true | false>, <charset>)

Returns a Base64 encoded string value of the specified string.

Return type: String.
Uses the following arguments:
  • string - The string to encode using Base64.
  • urlSafe - When set to true, encodes the data so that it can be safely sent in a URL.
  • charset - The character set to use to encode the data.
For example, ${base64:encodeString("mycredential", false, "UTF-8")} uses Base64 to encode mycredential using the UTF-8 character set such that the encoded data is not URL safe.
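
To see how the encode and decode functions relate, it can help to reproduce them outside Transformer. The following Python sketch uses the standard base64 module. The helper names decode_string and encode_string are hypothetical stand-ins, not StreamSets APIs, and details such as padding in URL-safe output might differ in Transformer.

```python
import base64


def decode_string(encoded: str, charset: str) -> str:
    """Decode Base64 text to bytes, then interpret the bytes with the charset."""
    return base64.b64decode(encoded).decode(charset)


def encode_string(text: str, url_safe: bool, charset: str) -> str:
    """Encode text to bytes with the charset, then Base64 encode the bytes."""
    raw = text.encode(charset)
    # The URL-safe alphabet replaces '+' with '-' and '/' with '_'.
    encoder = base64.urlsafe_b64encode if url_safe else base64.b64encode
    return encoder(raw).decode("ascii")


print(decode_string("dGhpc2lzbXlwYXNzd29yZA==", "UTF-8"))
print(encode_string("??>", False, "UTF-8"))
print(encode_string("??>", True, "UTF-8"))
```

In this sketch, the sample string used in the decode examples decodes to thisismypassword, and the only difference between the two encodings of ??> is the final character: a plus sign in the standard alphabet and a hyphen in the URL-safe alphabet.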

Batch Functions

Use batch functions to retrieve information about a batch when writing to most destinations.

In pipelines that have an origin configured to read from more than one table, a batch attribute stores the name of the table that the origin reads for the batch. You can use batch functions to retrieve that name and use it in the name of the table or directory where the destination writes data from that batch, such as in the Directory Path property of the File or ADLS destinations or in the Table property of the Hive or JDBC destinations.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following batch functions:
batch:attribute(<attribute name>)
Returns the value of the specified batch attribute.
Uses the following argument:
  • attribute name - Name of the batch attribute.

Return type: String

For example, the following expression returns the name of the table processed, specified in the jdbc.table attribute:
${batch:attribute("jdbc.table")}
batch:attributeOrDefault(<attribute name>, <default value>)
Returns the value of the specified batch attribute. When the attribute does not exist or has no value, returns the specified default value.
Uses the following arguments:
  • attribute name - Name of the batch attribute.
  • default value - Value to return when the batch attribute does not exist or has no value.

Return type: String when returning the batch attribute value, or the data type of the default value when returning the specified default value.

For example, the following expression returns the name of the table processed or returns NA if no value exists:
${batch:attributeOrDefault("jdbc.table", 'NA')}
For example, you might want the File destination to write data in directories named after the table where the data originated. To write to the /Transformer/output/<table name> directory in your local file system, include the following expression in the Directory Path property:
file:///Transformer/output/${batch:attribute("jdbc.table")}
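
The fallback behavior of batch:attributeOrDefault can be sketched as a lookup with a default. The following Python example is an illustration only: the attributes dictionary and the helper names attribute_or_default and output_directory are hypothetical and are not part of Transformer.

```python
def attribute_or_default(attributes, name, default):
    """Return the attribute value, or the default when it is missing or empty."""
    value = attributes.get(name)
    if value is None or value == "":
        return default
    return value


def output_directory(attributes):
    """Build a directory path named after the table that the batch came from."""
    table = attribute_or_default(attributes, "jdbc.table", "NA")
    return "file:///Transformer/output/" + table


print(output_directory({"jdbc.table": "orders"}))
print(output_directory({}))
```

With a batch read from an orders table, the sketch builds a path that ends in orders. When the attribute is absent, the path ends in NA instead.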

Credential Functions

Credential functions provide access to sensitive information, such as user names and passwords, that is secured in a credential store. Use credential functions in pipeline and stage properties to enable Transformer to access external systems without exposing those values.

Before you use a credential function, you must configure Transformer to use one of the supported credential stores.

You can use credential functions in any property that displays a key icon next to the property name.

Not all stages support credential functions. For example, StreamSets intentionally does not allow the use of credential functions in the Spark SQL Expression processor. If credential functions were allowed in stages such as the Spark SQL Expression processor, any user with access to the pipeline could access or print sensitive values, compromising the security of the external system.
Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.
You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.
The StreamSets expression language provides the following credential functions:
credential:get(<cstoreId>, <userGroup>, <name>)
Returns the secret from the credential store. Uses the following arguments:
  • cstoreId - Unique ID of the credential store to use. Use the ID specified in the $TRANSFORMER_CONF/credential-stores.properties file. For more information, see Enabling Credential Stores.
  • userGroup - Group that a user must belong to in order to access the secret. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the secret.

    If working with Control Hub, specify the group using the required naming convention: <group ID>@<organization ID>.

    To grant access to all users, specify the default all group when working only with Transformer. When working with Control Hub and Transformer version 3.14.0 or later, you can specify the default group using all or all@<organization ID>. StreamSets recommends using all so that you do not need to modify credential functions when migrating pipelines from Transformer to Control Hub.
    Note: When working with Control Hub and a Transformer version earlier than 3.14.0, you must use the default all@<organization ID> group.
  • name - Name of the secret to retrieve from the credential store. Use the required format for the credential store:
    • AWS Secrets Manager - Enter the name of the secret to retrieve from Secrets Manager. Use the following format: "<name><separator><key>", where:
      • <name> is the name of the secret in Secrets Manager to read.
      • <separator> is the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <key> is the key for the value that you want returned.
    • Azure Key Vault - Enter the name of the key or secret to retrieve from Azure Key Vault.
    • CyberArk - Enter the name of the secret to retrieve from CyberArk. Use the following format:

      "<safe><separator><folder><separator><object name>[<separator><element name>]"

      • <safe> is the CyberArk safe to read.
      • <separator> is the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <folder> is the CyberArk folder to read.
      • <object name> is the CyberArk object or secret to read.
      • <element name> is an optional name for the value that you want returned.

        If you do not specify <element name>, Transformer uses Content.

    • Google Secret Manager - Enter the secret name using the following format:

      "<name><delimiter><version ID>"

      • <name> is the secret name.
      • <delimiter> is the delimiter defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <version ID> is the version of the secret to return.
    • Hashicorp Vault - Enter the secret name using the following format:

      "<path><separator><key>"

      • <path> is the path in Vault to read.
      • <separator> is the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <key> is the key for the value that you want returned.
    • Java keystore - Enter the name of the secret added to the Java keystore file using the jks-cs add command.
Return type: String.
AWS Secrets Manager example: The following expression returns the value from the key SQLk1 of the secret SQLpassword from the awsdev credential store. Note that the expression uses an ampersand (&) as the separator argument because that is how the separator is defined in the $TRANSFORMER_CONF/credential-stores.properties file. The expression allows any user in the devops group to access the key when validating, previewing, or running the pipeline:
${credential:get("awsdev", "devops@MyCompany", "SQLpassword&SQLk1")}
JKS example: The following expression returns the value of the OracleDBPassword secret defined in the devjks credential store and allows any user belonging to the devops group access to the secret when validating, previewing, or running the pipeline:
${credential:get("devjks", "devops@MyCompany", "OracleDBPassword")}
credential:getWithOptions(<cstoreId>, <userGroup>, <name>, <storeOptions>)
Returns the secret from the credential store using additional options to communicate with the credential store. Not applicable for the Java keystore or Google Secret Manager credential stores.
For example, you might use this function with Azure Key Vault to specify a different vault URL to use.
Uses the following arguments:
  • cstoreId - Unique ID of the credential store to use. Use the ID specified in the $TRANSFORMER_CONF/credential-stores.properties file. For more information, see Enabling Credential Stores.
  • userGroup - Group that a user must belong to in order to access the secret. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the secret.

    If working with Control Hub, specify the group using the required naming convention: <group ID>@<organization ID>.

    To grant access to all users, specify the default all group when working only with Transformer. When working with Control Hub and Transformer version 3.14.0 or later, you can specify the default group using all or all@<organization ID>. StreamSets recommends using all so that you do not need to modify credential functions when migrating pipelines from Transformer to Control Hub.
    Note: When working with Control Hub and a Transformer version earlier than 3.14.0, you must use the default all@<organization ID> group.
  • name - Name of the secret to retrieve from the credential store. Use the required format for the credential store:
    • AWS Secrets Manager - Enter the name of the secret to retrieve from Secrets Manager. Use the following format: "<name><separator><key>", where:
      • <name> is the name of the secret in Secrets Manager to read.
      • <separator> is the separator defined in either the $TRANSFORMER_CONF/credential-stores.properties file or using the separator option, below.
      • <key> is the key for the value that you want returned.
    • Azure Key Vault - Enter the name of the key or secret to retrieve from Azure Key Vault.
    • CyberArk - Enter the name of the secret to retrieve from CyberArk. Use the following format:

      "<safe><separator><folder><separator><object name>[<separator><element name>]"

      • <safe> is the CyberArk safe to read.
      • <separator> is the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <folder> is the CyberArk folder to read.
      • <object name> is the CyberArk object or secret to read.
      • <element name> is an optional name for the value that you want returned.

        If you do not specify <element name>, Transformer uses Content.

    • Hashicorp Vault - Enter the secret name using the following format:

      "<path><separator><key>"

      • <path> is the path in Vault to read.
      • <separator> is the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file.
      • <key> is the key for the value that you want returned.
  • storeOptions - Additional options to communicate with the credential store.
    For AWS Secrets Manager, you can use the following options to override several properties in the $TRANSFORMER_CONF/credential-stores.properties file:
    • separator - Specifies the separator for name and key values in the credential functions, overriding the credentialStore.<cstore ID>.config.nameKey.separator property.
    • alwaysRefresh - When set to true, forces the key to refresh its cached value before Transformer retrieves the value, overriding the credentialStore.<cstore ID>.config.cache.ttl.millis property. Be aware that always refreshing the cached value significantly increases the pipeline run time.
    For Azure Key Vault, you can use the following options to override several properties in the $TRANSFORMER_CONF/credential-stores.properties file:
    • url - Overrides the credentialStore.<cstore ID>.config.vault.url property.
    • retry - Overrides the credentialStore.<cstore ID>.config.credential.retry.millis property.
    • refresh - Overrides the credentialStore.<cstore ID>.config.credential.refresh.millis property.

    For CyberArk, you can use the following options:

    • separator - Separator to use for the secret name.
    • ConnectionTimeout - Connection timeout value.
    • FailRequestOnPasswordChange - Whether to fail the request on a password change, set to true or false.

    For Hashicorp Vault, you can use the delay option to enter a delay in milliseconds to allow time for external processing. Use the delay option when using the Vault AWS secret backend to generate AWS access credentials based on IAM policies. According to Vault documentation, you might need a delay of 10 seconds or more before the credentials can be used successfully.

    Use the following format to specify options:
    "<option1>=<value>,<option2>=<value>"
    For example, to set the Azure Key Vault retry property to 1000, enter the following for the options argument:
    "retry=1000"
Return type: String.
AWS Secrets Manager example: The following expression returns the value from the key SQLk1 of the secret SQLpassword from the awsdev credential store, overriding the separator defined in the $TRANSFORMER_CONF/credential-stores.properties file with a pipe ( | ). The expression allows any user in the devops group to access the key when validating, previewing, or running the pipeline:
${credential:getWithOptions("awsdev", "devops@MyCompany", "SQLpassword|SQLk1", "separator=|")}
Azure example: The following expression returns the value stored in the DevOpsGen2Pw secret from the azureprod credential store and caches it for two seconds. The expression allows any user belonging to the devops group access to the secret when validating, previewing, or running the pipeline:
${credential:getWithOptions("azureprod", "devops@MyCompany", "DevOpsGen2Pw", "refresh=2000")}
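
The name and storeOptions arguments are both plain strings with an internal structure. The following Python sketch shows one way to read that structure. It is not how Transformer parses these values: the helper names parse_store_options and split_secret_name are hypothetical, and the sketch assumes that option values do not themselves contain commas.

```python
def parse_store_options(options: str) -> dict:
    """Parse "<option1>=<value>,<option2>=<value>" into a dictionary."""
    parsed = {}
    for pair in options.split(","):
        if not pair:
            continue
        # Split on the first '=' only, so a value may contain '='.
        key, _, value = pair.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


def split_secret_name(name: str, separator: str) -> list:
    """Split a secret name such as "<name><separator><key>" into its parts."""
    return name.split(separator)


options = parse_store_options("separator=|")
print(options)
print(split_secret_name("SQLpassword|SQLk1", options["separator"]))
```

In the AWS Secrets Manager example, the separator option supplies the pipe that divides SQLpassword from SQLk1, which is why the same character must appear in both the name argument and the options argument.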

File Functions

Use file functions to return information about a file name or path. For example, you might use a file function to remove a file extension from a file path or to return part of the path.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following file functions:
file:fileExtension(<filepath>)
Returns the file extension from a file path. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns txt:
${file:fileExtension('/logs/weblog.txt')}
file:fileName(<filepath>)
Returns the file name from a file path. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns the file name, weblog.txt:
${file:fileName('/logs/weblog.txt')}
file:parentPath(<filepath>)
When used with a path to a file, returns the path to the directory that contains the file, without the final separator, such as /files for /files/file.log.
When used with a path to a directory, returns the path to the parent directory, without the final separator, such as /serverA/logs for /serverA/logs/2016.
Uses the following argument:
  • filepath - An absolute path to a file or directory.
Return type: String.
For example, the following expression, which includes a path to a file, returns /serverB/logs:
${file:parentPath('/serverB/logs/weblog.txt')}
Similarly, the following expression, which includes a path to a directory, returns the parent directory, /serverB/logs:
${file:parentPath('/serverB/logs/weblogs')}
file:pathElement(<filepath>, <integer>)
Returns the part of a path based on the specified integer. Uses the following arguments:
  • filepath - An absolute path to a file.
  • integer - The section of a path to return. Can return parts starting from the left or right side of the path:
    • To return a section of a path, counting from the left side of the path, use 0 and positive integers and start with 0.
    • To return a section of a path, counting from the right side of the path, use negative integers and start with -1.
Return type: String.
For example, to return the logs portion of the path, you can use either of the following expressions:
${file:pathElement('/logs/weblog.txt',0)}
${file:pathElement('/logs/weblog.txt',-2)}
file:removeExtension(<filepath>)
Returns the file path without the file extension. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns /logs/weblog:
${file:removeExtension('/logs/weblog.txt')}
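
The file function examples above can be reproduced with ordinary path handling. The following Python sketch uses the standard posixpath module. The helper names are hypothetical equivalents chosen for illustration, and the sketch assumes forward-slash paths like those in the examples.

```python
import posixpath


def file_extension(path):
    """Return the extension without the leading dot."""
    return posixpath.splitext(path)[1].lstrip(".")


def file_name(path):
    """Return the last element of the path."""
    return posixpath.basename(path)


def parent_path(path):
    """Return everything before the last element, without the final separator."""
    return posixpath.dirname(path)


def path_element(path, index):
    """Return one element: 0 is the leftmost, -1 is the rightmost."""
    parts = [part for part in path.split("/") if part]
    return parts[index]


def remove_extension(path):
    """Return the path without the extension."""
    return posixpath.splitext(path)[0]


print(file_extension("/logs/weblog.txt"))
print(parent_path("/serverB/logs/weblog.txt"))
print(path_element("/logs/weblog.txt", -2))
```

Because /logs/weblog.txt has only two elements, index 0 from the left and index -2 from the right both select logs.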

Job Functions

Use job functions to return information about a Control Hub job. For example, you might use a job function to return the name of the job running a pipeline.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language includes the following job functions:
job:id()
Returns the ID of the job if the pipeline was run from a Control Hub job. Otherwise, returns UNDEFINED.
Return type: String.
job:name()
Returns the name of the job if the pipeline was run from a Control Hub job. Otherwise, returns UNDEFINED.
Return type: String.
job:startTime()
Returns the start time of the job if the pipeline was run from a Control Hub job. Otherwise, returns the start time of the pipeline.
Return type: Datetime.
job:user()
Returns the user who started the job if the pipeline was run from a Control Hub job. Otherwise, returns UNDEFINED.
Return type: String.

Math Functions

Use math functions to perform math on numeric values.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

You can use the following data types with math functions:
  • Double
  • Float
  • Integer
  • Long
  • String
The StreamSets expression language provides the following math functions:
math:abs(<number>)
Returns the absolute value, or positive version, of the argument. If the argument is already positive, returns the original number.
Return type: Double, Float, Int, or Long, based on the data type of the argument.
math:ceil(<number>)
Returns the smallest integer greater than or equal to the argument.
For example, ${math:ceil(8.0+3.6)} returns 12.0.
Return type: Double.
math:floor(<number>)
Returns the largest integer less than or equal to the argument.
Return type: Double.
math:max(<number1>, <number2>)
Returns the greater of two arguments.
Return type: Double, Float, Int, or Long, based on the data type of the arguments.
math:min(<number1>, <number2>)
Returns the lesser of two arguments.
Return type: Double, Float, Int, or Long, based on the data type of the arguments.
math:round(<number>)
Returns the closest number to the argument, rounding up for ties.
Return type: Double or Long.
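
The difference between ceil, floor, and round is easiest to see with negative numbers and ties. The following Python sketch mirrors the descriptions above. The helper names are hypothetical, and the rounding helper is written out explicitly because the built-in Python round function resolves ties to the nearest even number rather than rounding up.

```python
import math


def math_ceil(x):
    """Smallest integer greater than or equal to x, returned as a float."""
    return float(math.ceil(x))


def math_floor(x):
    """Largest integer less than or equal to x, returned as a float."""
    return float(math.floor(x))


def math_round(x):
    """Closest integer to x, with ties rounded up toward positive infinity."""
    return math.floor(x + 0.5)


print(math_ceil(8.0 + 3.6))
print(math_floor(-3.2))
print(math_round(2.5), math_round(-2.5))
```

Rounding up for ties means that 2.5 becomes 3 and -2.5 becomes -2 in this sketch, since both results are the larger of the two nearest integers.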

Pipeline Functions

Use pipeline functions to determine information about a pipeline, such as the pipeline title or ID. The StreamSets expression language provides the following pipeline functions:

pipeline:id()
Returns the ID of the pipeline. The ID is a UUID automatically generated when the pipeline is created and is used by Transformer to identify the pipeline. The pipeline ID cannot be changed.
Return type: String.
For example, to use the pipeline title together with the pipeline ID as the name of the Spark application for the pipeline, you might specify the following expression in the pipeline Application Name property:
${pipeline:title()}-${pipeline:id()}
pipeline:name()
Like pipeline:id, this function returns the ID of the pipeline. The ID is a UUID automatically generated when the pipeline is created and is used by Transformer to identify the pipeline. The pipeline ID cannot be changed.
Return type: String.
pipeline:startTime()
Returns the start time of the pipeline.

Return type: Datetime.

pipeline:title()
Returns the title or name of the pipeline.
Return type: String.
pipeline:user()
Returns the user who started the pipeline.
Return type: String.
pipeline:version()
Returns the pipeline version when the pipeline has been published to StreamSets Control Hub. Returns UNDEFINED if the pipeline has not been published to Control Hub. Use this function only when you have registered Transformer to work with Control Hub.
Return type: String.

String Functions

Use string functions to transform string data.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following string functions:

str:concat(<string1>, <string2>)
Concatenates two strings together.
Uses the following arguments:
  • string1 - The first string to concatenate.
  • string2 - The second string to concatenate.
Return type: String.
str:contains(<string>, <subset>)
Returns true or false based on whether the string contains the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
For example, ${str:contains("Jane", "boo")} returns: false.
str:endsWith(<string>, <subset>)
Returns true or false based on whether the string ends with the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
For example, ${str:endsWith("32403-1001", "1001")} returns: true.
str:escapeXML10(<string>)
Returns a string that you can embed in an XML 1.0 or 1.1 document.
Uses the following argument:
  • string - The string to escape.
Return type: String.
str:escapeXML11(<string>)
Returns a string that you can embed in an XML 1.1 document.
Uses the following argument:
  • string - The string to escape.
Return type: String.
str:indexOf(<string>, <subset>)
Returns the index within a string of the first occurrence of the specified subset of characters.
Uses the following arguments:
  • string - The string to return the index of.
  • subset - The subset of characters to look for.
Return type: Integer.
For example, ${str:indexOf("pepper", "pe")} returns: 0.
str:isNullOrEmpty(<string>)
Returns true or false based on whether a string is null or is the empty string.
Uses the following argument:
  • string - The string to evaluate.
str:lastIndexOf(<string>, <subset>)
Returns the index within a string of the last occurrence of the specified subset of characters.
Uses the following arguments:
  • string - The string to return the index of.
  • subset - The subset of characters to look for.
Return type: Integer.
For example, ${str:lastIndexOf("pepper", "pe")} returns: 3.
str:length(<string>)
Returns the length of a string.
Uses the following argument:
  • string - The string to return the length for.
Return type: Integer.
For example, ${str:length("tomorrow")} returns: 8.
str:matches(<string>, <regEx>)
Returns true or false based on whether a string matches a Java regex pattern.
Uses the following arguments:
  • string - The string to evaluate.
  • regEx - Regular expression that describes the pattern of the string.
str:regExCapture(<string>, <regEx>, <group>)
Parses a complex string into groups based on a Java regex pattern and returns the specified group.
Uses the following arguments:
  • string - The string that contains a pattern of characters.
  • regEx - Regular expression that describes the pattern of the string, separating it into groups. Use the backslash as an escape character for special characters in the expression. For example, to represent a digit in the expression with the characters \d, use \\d.
  • group - The number of the group to return, where 1 represents the first group, 2 represents the second group, etc. 0 returns the entire string.
Return type: String.
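
Group numbering in str:regExCapture follows the usual regular expression convention. The following Python sketch illustrates it with the standard re module. The helper name regex_capture and the sample log line are hypothetical, Python regular expressions are close to but not identical to Java regular expressions, and returning None when nothing matches is an assumption of this sketch rather than documented Transformer behavior.

```python
import re


def regex_capture(text, pattern, group):
    """Return the requested capture group, or None when the pattern does not match."""
    match = re.search(pattern, text)
    return match.group(group) if match else None


line = "2016-08-14 ERROR disk full"
pattern = r"(\d{4})-(\d{2})-(\d{2}) (\w+) (.*)"

print(regex_capture(line, pattern, 1))
print(regex_capture(line, pattern, 4))
print(regex_capture(line, pattern, 0))
```

The raw string in the sketch lets Python use \d directly. In an expression, the same digit class is written as \\d, as described for the regEx argument.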
str:replace(<string>, <oldChar>, <newChar>)
Replaces all instances of a specified character in a string with a new character.
Uses the following arguments:
  • string - The string that contains the character to replace.
  • oldChar - Character to replace.
  • newChar - Character to use for replacement.
Return type: String.
For example, ${str:replace("lecucereche", "e", "a")} returns: lacucaracha.
str:replaceAll(<string>, <regEx>, <newString>)
Replaces a set of characters in a string with a new set of characters.
Uses the following arguments:
  • string - The string that contains the group of characters to replace.
  • regEx - A regular expression that describes the string to replace.
  • newString - The set of characters to use for replacement.
Return type: String.

For example, ${str:replaceAll("shoes_sandals","^shoes","footwear")} returns: footwear_sandals.
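
The distinction between str:replace and str:replaceAll is that the first matches a literal character and the second matches a regular expression. The following Python sketch reproduces the two documented examples. The helper names are hypothetical, and the syntax for referring to capture groups in the replacement string differs between Java and Python, so the sketch uses only plain replacement text.

```python
import re


def replace(text, old_char, new_char):
    """Replace every literal occurrence of old_char with new_char."""
    return text.replace(old_char, new_char)


def replace_all(text, regex, new_string):
    """Replace every match of the regular expression with new_string."""
    return re.sub(regex, new_string, text)


print(replace("lecucereche", "e", "a"))
print(replace_all("shoes_sandals", "^shoes", "footwear"))
print(replace_all("sandals_shoes", "^shoes", "footwear"))
```

The caret anchors the pattern to the start of the string, so the last call leaves sandals_shoes unchanged.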

str:split(<string>, <separator>)
Splits a string into a list of strings based on the specified separator.
Uses the following arguments:
  • string - An input string.
  • separator - The set of characters that designate a string split.
Return type: List of strings.
str:splitKV(<string>, <pairSeparator>, <keyValueSeparator>)
Splits key-value pairs in a string into a map of string values.
Uses the following arguments:
  • string - The string containing the key-value pairs.
  • pairSeparator - The set of characters that separate the key-value pairs.
  • keyValueSeparator - The set of characters that separate each key and value.
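
The two separators in str:splitKV work at different levels: one divides the string into pairs and the other divides each pair into a key and a value. The following Python sketch shows that two-step split. The helper name split_kv and the sample input are hypothetical, and the handling of a pair with no key-value separator (an empty value here) is an assumption.

```python
def split_kv(text, pair_separator, key_value_separator):
    """Split "k1=v1&k2=v2"-style text into a dictionary of strings."""
    result = {}
    if not text:
        return result
    for pair in text.split(pair_separator):
        # Split each pair on the first key-value separator only.
        key, _, value = pair.partition(key_value_separator)
        result[key] = value
    return result


print(split_kv("dept=sales&region=emea", "&", "="))
```

For comparison, splitting the same input with only the pair separator, as str:split does, would produce the list of two pair strings rather than a map.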
str:startsWith(<string>, <subset>)
Returns true or false based on whether the string starts with the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
Return type: Boolean.
For example, ${str:startsWith("Transformer", "Trans")} returns: true.
str:substring(<string>, <beginIndex>, <endIndex>)
Returns a subset of the string value that starts with the beginIndex character and ends one character before the endIndex.
Uses the following arguments:
  • string - The string that contains the return substring that you want.
  • beginIndex - An integer that represents the beginning position of the returned substring. Start the count from the left with 0.
  • endIndex - An integer that represents one position past the end of the substring.
Return type: String.
For example, ${str:substring("Chewing Gum", 0, 5)} returns: Chewi.
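
The beginIndex is inclusive and the endIndex is exclusive, so the number of characters returned is endIndex minus beginIndex. Python slices use the same half-open convention, which makes them a convenient way to check an index pair. The helper name substring is hypothetical, and unlike this sketch, Transformer might reject indexes that fall outside the string.

```python
def substring(text, begin_index, end_index):
    """Return characters from begin_index up to, but not including, end_index."""
    return text[begin_index:end_index]


print(substring("Chewing Gum", 0, 5))
print(substring("Chewing Gum", 0, 4))
print(substring("Chewing Gum", 8, 11))
```

The range 0 to 5 covers the five characters at positions 0 through 4, and the range 8 to 11 covers the final word.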
str:toLower(<string>)
Converts string data to all lowercase letters.
Uses the following argument:
  • string - The string to lower case.
Return type: String.
For example, ${str:toLower("FALSE")} returns: false.
str:toUpper(<string>)
Converts string data to all capital letters.
Uses the following argument:
  • string - The string to capitalize.
Return type: String.
For example, ${str:toUpper("true")} returns: TRUE.
str:trim(<string>)
Trims leading and trailing white space characters from a string, including spaces and return characters.
Uses the following argument:
  • string - The string to return without additional white space characters.
Return type: String.
str:truncate(<string>, <length>)
Returns a string truncated to the specified length. Use an integer to specify the length.
Uses the following arguments:
  • string - The string to truncate.
  • length - An integer that represents the number of characters to keep.
Return type: String.
str:unescapeJava(<string>)
Returns an unescaped string from a string with special Java characters. Use to include binary or non-printable characters in any location where you can enter an expression.
Uses the following argument:
  • string - The string to process.
Return type: String.
str:unescapeXML(<string>)
Returns an unescaped string from a string that had XML data escaped.
Uses the following argument:
  • string - The string that includes escaped XML data.
Return type: String.
str:urlDecode(<URL>, <charset>)
Converts characters from a URL to the specified character set, such as UTF-8.
Uses the following arguments:
  • URL - The URL to decode.
  • charset - The character set to use, such as UTF-8.
Return type: String.
For example, to convert a URL to UTF-8, use the following expression:
${str:urlDecode(record:value('/URL'), 'UTF-8')}
str:urlEncode(<infoforURL>, <charset>)
Converts invalid characters to help create a valid URL based on the specified character set, such as UTF-8. You might use this function when using record data to add additional information, like a fragment, to a URL.
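Uses the following arguments:
  • infoforURL - The information to encode for use in a URL.
  • charset - The character set to use, such as UTF-8.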
Return type: String.
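As a rough illustration of what URL encoding and decoding do to a value, the following plain Java sketch round-trips a string through java.net.URLEncoder and java.net.URLDecoder. It is an assumption that str:urlEncode and str:urlDecode behave like these classes. The class name UrlCodecSketch and the sample value are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class UrlCodecSketch {
    // Percent-encodes characters that are not safe in a URL, using the given charset.
    static String urlEncode(String info, String charset) {
        try {
            return URLEncoder.encode(info, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Reverses the encoding, interpreting the escaped bytes in the given charset.
    static String urlDecode(String url, String charset) {
        try {
            return URLDecoder.decode(url, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String fragment = "order #42 & notes";
        String encoded = urlEncode(fragment, "UTF-8");
        // URLEncoder uses form encoding, so each space becomes a plus sign.
        System.out.println(encoded);
        // Decoding with the same charset restores the original text.
        System.out.println(urlDecode(encoded, "UTF-8").equals(fragment));
    }
}
```

The character set matters mainly for non-ASCII characters, which are escaped as one or more bytes in that encoding.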

Time Functions

Use time functions to return the current time or to transform datetime information.

You can replace any datetime argument with an expression that evaluates to a datetime value. You cannot replace a datetime argument with a datetime literal.

You can replace any other argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following time functions:

time:createDateFromStringTZ(<string>, <time zone>, <date format>)
Creates a Date object based on a datetime in a String field and using the specified time zone. The datetime string should not include the time zone.
Uses the following arguments:
  • string - String with datetime values, not including the time zone.
  • time zone - Time zone associated with the datetime values. The time zone is used when creating the Date object.
    You can use the following time zone formats:
    • <area>/<location> - For example, America/Chicago or Europe/Madrid.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.
  • date format - The date format used by the string data. For information about date formats, see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
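To show why the time zone argument matters, the following plain Java sketch parses the same datetime string in two time zones and gets two different instants. It assumes the function behaves like a java.text.SimpleDateFormat whose time zone is set before parsing. The class name CreateDateSketch and the sample values are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class CreateDateSketch {
    // Interprets a zone-less datetime string as wall-clock time in the given zone.
    static Date createDateFromStringTZ(String value, String timeZone, String dateFormat) {
        SimpleDateFormat parser = new SimpleDateFormat(dateFormat, Locale.ROOT);
        parser.setTimeZone(TimeZone.getTimeZone(timeZone));
        try {
            return parser.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not match format: " + value, e);
        }
    }

    public static void main(String[] args) {
        String text = "2017-05-01 20:15:30";
        String format = "yyyy-MM-dd HH:mm:ss";
        Date chicago = createDateFromStringTZ(text, "America/Chicago", format);
        Date madrid = createDateFromStringTZ(text, "Europe/Madrid", format);
        // Same text, different instants: the two results are seven hours apart.
        System.out.println(chicago.toInstant());
        System.out.println(madrid.toInstant());
    }
}
```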
time:dateTimeToMilliseconds(<Date object>)

Converts a Date object to an epoch or UNIX time in milliseconds.

For example, the following expression converts the current time to epoch or UNIX time in milliseconds:
${time:dateTimeToMilliseconds(time:now())}

Return type: Long.

time:dateTimeZoneOffset(<Date object>, <time zone>)

Returns the time zone offset in milliseconds for the specified date and time zone. The time zone offset is the difference in hours and minutes from Coordinated Universal Time (UTC).

Uses the following arguments:
  • Date object - Date object to use.
  • time zone - Time zone associated with the Date object.
    You can use the following time zone formats:
    • <area>/<location> - For example, America/Chicago or Europe/Madrid.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.
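The difference between the time zone formats can be seen with java.util.TimeZone. The following plain Java sketch assumes that the function resolves time zones the same way. The class name ZoneOffsetSketch is hypothetical.

```java
import java.time.Instant;
import java.util.Date;
import java.util.TimeZone;

final class ZoneOffsetSketch {
    // Offset from UTC, in milliseconds, that the zone observes at the given instant.
    static long dateTimeZoneOffset(Date date, String timeZone) {
        return TimeZone.getTimeZone(timeZone).getOffset(date.getTime());
    }

    public static void main(String[] args) {
        Date winter = Date.from(Instant.parse("2017-01-15T12:00:00Z"));
        Date summer = Date.from(Instant.parse("2017-07-15T12:00:00Z"));
        // An <area>/<location> zone changes its offset with daylight saving time.
        System.out.println(dateTimeZoneOffset(winter, "America/Chicago"));
        System.out.println(dateTimeZoneOffset(summer, "America/Chicago"));
        // A GMT-prefixed zone keeps one fixed offset all year.
        System.out.println(dateTimeZoneOffset(winter, "GMT-0500"));
        // Plain Java does not reject an unrecognized ID such as -500; it falls back to GMT.
        System.out.println(TimeZone.getTimeZone("-500").getID());
    }
}
```

In plain Java, an unrecognized ID silently resolves to GMT. How Transformer handles an unsupported time zone is not covered by this sketch.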
time:extractDateFromString(<string>, <format string>)

Extracts a Date object from a String, based on the specified date format.

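Uses the following arguments:
  • string - String that contains the datetime value.
  • format string - String that specifies the date format used by the string data.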
For example, the following expression converts the string 2017-05-01 20:15:30.915 to a Date object:
${time:extractDateFromString('2017-05-01 20:15:30.915','yyyy-MM-dd HH:mm:ss.SSS')}
Return type: Date object.
time:extractLongFromDate(<Date object>, <format string>)
Extracts a long value from a Date object, based on the specified date format.
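Uses the following arguments:
  • Date object - Date object to use.
  • format string - String that specifies the date format to use. The format must produce numeric output only.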
Return type: Long.
Note: Because the function returns a long value, the date format string must produce only numeric output. For example, the date format "MMM" returns the three-character abbreviation for the month (such as "Jul"), which causes the function to return incorrect results.
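The numeric-only restriction is easier to see in a sketch that formats the date and then parses the text as a long. The following plain Java example assumes that approach. The class name ExtractLongSketch is hypothetical, and the UTC time zone and English locale are fixed only to make the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ExtractLongSketch {
    static long extractLongFromDate(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ENGLISH);
        // Assumption for reproducibility: format in UTC rather than the machine's time zone.
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        // The conversion only succeeds when the formatted text consists of digits.
        return Long.parseLong(formatter.format(date));
    }

    public static void main(String[] args) {
        Date date = Date.from(Instant.parse("2016-07-25T17:18:05Z"));
        System.out.println(extractLongFromDate(date, "yyyyMMdd"));
        System.out.println(extractLongFromDate(date, "HHmmss"));
        try {
            // "MMM" formats to the text "Jul", which is not a number.
            extractLongFromDate(date, "MMM");
        } catch (NumberFormatException e) {
            System.out.println("Not numeric: " + e.getMessage());
        }
    }
}
```

In this sketch a non-numeric format fails with an exception. That failure mode belongs to the sketch only; the sketch does not show how the StreamSets function itself responds to a non-numeric format.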
time:extractStringFromDate(<Date object>, <format string>)
Extracts a string value from a Date object based on the specified date format.
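Uses the following arguments:
  • Date object - Date object to use.
  • format string - String that specifies the date format to use for the returned string.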
Return type: String.
time:extractStringFromDateTZ(<Date object>, <time zone>, <format string>)
Extracts a string value from a Date object, converting the GMT time in the Date object to the specified date format and time zone. The function adjusts for daylight saving time when given the time zone in the appropriate format.
Uses the following arguments:
  • Date object - Date object to use.
  • time zone - Time zone to use in the conversion.

    To convert the time zone and adjust for daylight saving time, use the <Area>/<Location> format, such as America/Mexico_City. See the following link for a list of valid time zones in this format: https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/timezone.html.

    Short time zone IDs such as EST and CST return data, but these time zones do not adjust for daylight saving time. Avoid them because they can stand for multiple time zones. For example, CST stands for both Central Standard Time and China Standard Time.

    You can use numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00, but these time zones also do not account for daylight saving time. Note that numeric-only time zones such as -500 are not supported.

  • format string - String that specifies the format to use to express the date. For information about creating a date format, see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
Return type: String.
Returns an empty string when the time zone is not specified or invalid.
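The effect of the time zone format on daylight saving time can be illustrated with java.text.SimpleDateFormat. The following plain Java sketch assumes that the function formats dates in a comparable way. The class name FormatInZoneSketch is hypothetical, and the sketch does not reproduce the empty-string result for a missing or invalid time zone.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class FormatInZoneSketch {
    // Formats the instant held by the Date as wall-clock time in the given zone.
    static String extractStringFromDateTZ(Date date, String timeZone, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(TimeZone.getTimeZone(timeZone));
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date summer = Date.from(Instant.parse("2017-07-15T12:00:00Z"));
        String pattern = "yyyy-MM-dd HH:mm";
        // New York observes daylight saving time in July, so it is four hours behind UTC.
        System.out.println(extractStringFromDateTZ(summer, "America/New_York", pattern));
        // The GMT-prefixed zone stays five hours behind UTC all year.
        System.out.println(extractStringFromDateTZ(summer, "GMT-0500", pattern));
    }
}
```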
time:millisecondsToDateTime(<long>)
Converts an epoch or UNIX time in milliseconds to a Date object.
If the epoch or UNIX time is in seconds, multiply the value by 1000 to produce a value in the milliseconds range.
Return type: Date object.
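The seconds-to-milliseconds conversion and the round trip back to epoch time can be sketched with java.util.Date. The following plain Java example is an illustration under that assumption, not StreamSets code. The class name EpochSketch and the sample value are hypothetical.

```java
import java.util.Date;

final class EpochSketch {
    // Wraps an epoch value in milliseconds in a Date.
    static Date millisecondsToDateTime(long epochMillis) {
        return new Date(epochMillis);
    }

    // Reads the epoch value in milliseconds back out of a Date.
    static long dateTimeToMilliseconds(Date date) {
        return date.getTime();
    }

    public static void main(String[] args) {
        long epochSeconds = 1500120000L;
        // Multiply as a long: the product does not fit in a 32-bit integer.
        Date date = millisecondsToDateTime(epochSeconds * 1000L);
        System.out.println(date.toInstant());
        // Converting back returns the original millisecond value.
        System.out.println(dateTimeToMilliseconds(date));
    }
}
```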
time:now()
Returns the current time of the Transformer machine as a java.util.Date object.
Return type: Datetime.
time:timeZoneOffset(<time zone>)

Returns the time zone offset in milliseconds for the specified time zone. The time zone offset is the difference in hours and minutes from Coordinated Universal Time (UTC).

Uses the following argument:
  • time zone - Time zone to use.
    You can use the following time zone formats:
    • <area>/<location> - For example, America/Chicago or Europe/Madrid.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.

Return type: Long.

time:trimDate(<datetime>)
Trims the date portion of a datetime value by setting the date portion to January 1, 1970.
For example, if the current time of the Transformer machine is Jul 25, 2016 5:18:05 PM, then ${time:trimDate(time:now())} returns: Jan 1, 1970 5:18:05 PM.
Return type: Datetime.
time:trimTime(<datetime>)
Trims the time portion of a datetime value by setting the time portion to 00:00:00.
Return type: Datetime.
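The two trim functions are mirror images of each other: one resets the calendar date and keeps the time of day, the other keeps the date and resets the time of day. The following plain Java sketch shows the idea with java.util.GregorianCalendar. It works in UTC so that the output is the same on every machine, which is an assumption of the sketch, as is zeroing the milliseconds in trimTime. The class name TrimSketch is hypothetical.

```java
import java.time.Instant;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class TrimSketch {
    // Assumption: work in UTC so the example gives the same result everywhere.
    private static Calendar utcCalendar(Date value) {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        calendar.setTime(value);
        return calendar;
    }

    // Keeps the time of day and moves the date to January 1, 1970.
    static Date trimDate(Date value) {
        Calendar calendar = utcCalendar(value);
        calendar.set(1970, Calendar.JANUARY, 1);
        return calendar.getTime();
    }

    // Keeps the date and resets the time of day to 00:00:00.
    static Date trimTime(Date value) {
        Calendar calendar = utcCalendar(value);
        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        Date value = Date.from(Instant.parse("2016-07-25T17:18:05Z"));
        System.out.println(trimDate(value).toInstant());
        System.out.println(trimTime(value).toInstant());
    }
}
```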

Miscellaneous Functions

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following miscellaneous functions:

runtime:availableProcessors()

Returns the number of processors available to the Java virtual machine. You can use this function when you want to configure multithreaded processing based on the number of processors available to Transformer.

Return type: Integer.

runtime:conf(<runtime property name>)
Returns the value for the specified runtime property. Use to call a runtime property.
Uses the following argument:
  • runtime property name - Name of the runtime property to call. The property must be defined in the Transformer configuration file or in a separate runtime properties file.
For example, the following expression returns the value of the ADLS_SAcct runtime property:
${runtime:conf('ADLS_SAcct')}
runtime:loadResource(<file name>, <restricted: true | false>)
Returns the value in the specified file, trimming any leading or trailing whitespace characters from the file. Use to call a runtime resource.
Uses the following arguments:
  • file name - Name of the file that contains the information to be loaded. The file must reside in the $TRANSFORMER_RESOURCES directory.
  • restricted - Whether the file has restricted permissions. If set to true, the file must be owned by the system user who runs Transformer and must be readable and writable only by the owner.
For example, the following expression returns the contents of the restricted JDBCpassword.txt file, trimming any leading or trailing whitespace characters:
${runtime:loadResource("JDBCpassword.txt", true)}
runtime:loadResourceRaw(<file name>, <restricted: true | false>)
Returns the entire contents in the specified file, including any leading or trailing whitespace characters in the file. Use to call a runtime resource.
Uses the following arguments:
  • file name - Name of the file that contains the information to be loaded. The file must reside in the $TRANSFORMER_RESOURCES directory.
  • restricted - Whether the file has restricted permissions. If set to true, the file must be owned by the system user who runs Transformer and must be readable and writable only by the owner.
For example, the following expression returns the entire contents of the restricted JDBCpassword.txt file, including any leading or trailing whitespace characters:
${runtime:loadResourceRaw("JDBCpassword.txt", true)}
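The practical difference between the two resource functions is the whitespace handling, which matters when a file ends with a newline. The following plain Java sketch imitates both behaviors on a temporary file. It is not StreamSets code: the class name ResourceSketch and the helper tempFileWith are hypothetical, UTF-8 is an assumed encoding, and the restricted permission check is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ResourceSketch {
    // Returns the file contents exactly as stored, including surrounding whitespace.
    static String loadResourceRaw(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the file contents with leading and trailing whitespace removed.
    static String loadResource(Path file) {
        return loadResourceRaw(file).trim();
    }

    // Hypothetical helper that writes sample content to a temporary file.
    static Path tempFileWith(String content) {
        try {
            Path file = Files.createTempFile("resource-sketch", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Many editors add a trailing newline when saving a file.
        Path file = tempFileWith("s3cret\n");
        System.out.println(loadResource(file).length());
        System.out.println(loadResourceRaw(file).length());
    }
}
```

For values such as passwords, a stray trailing newline changes the value, so the trimmed form is usually the safer choice.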
sdc:hostname()
Returns the host name of the Transformer machine.
Return type: String.
sdc:id()
Returns the Transformer ID.

For a pipeline that runs in standalone execution mode, the ID is a unique identifier associated with the Transformer, such as 58efbb7c-faf4-4d8e-a056-f38667e325d0. The ID is stored in the following file: $TRANSFORMER_DATA/transformer.id.

For a pipeline that runs in cluster mode, the ID is the Transformer worker partition ID generated by a cluster application, such as Spark or MapReduce.

size()
Returns the size of a map.
Return type: Integer.
uuid:uuid()
Returns a randomly generated UUID.
This function uses a lot of entropy on Linux systems and can cause your entropy pools to run dry. When this happens, your pipelines slow to a halt but continue to run. Throughput effectively goes to zero while the system waits for entropy to again become available. As a best practice, we recommend running the haveged daemon on any Transformer machine where you use this function. The haveged daemon regenerates your entropy pools.
Return type: String.