User-Defined Functions

A Snowflake user-defined function (UDF) is a custom function that you can create and then use like any Snowflake system-defined function.

You can call any precompiled UDF in your Snowflake account from a pipeline. When you run a pipeline that includes a precompiled UDF, Transformer for Snowflake incorporates the UDF into the pipeline logic.

You can also define the following types of inline Java scalar UDFs in a pipeline and then call them in the pipeline:
Permanent inline UDF
Define a permanent inline UDF when you plan to call the UDF in additional pipelines.
When you run a pipeline that defines a permanent inline UDF, Snowflake compiles the UDF using the configured pipeline properties, stores the compiled jar file in the specified location in your Snowflake account, and permanently saves the compiled UDF. Then, the UDF logic is used wherever the pipeline calls it.
Because Snowflake saves a permanent compiled UDF, you can then call the UDF from other pipelines just as you call other precompiled UDFs.
The compiled UDF is not available to other Snowflake users by default.
Temporary inline UDF
Define a temporary inline UDF when you plan to call the UDF in the current pipeline only.
When you run a pipeline that defines a temporary inline UDF, Snowflake compiles the UDF using the configured pipeline properties, stores the compiled jar file in the specified location in your Snowflake account, and saves the compiled UDF for the current pipeline run only. When the pipeline finishes, Snowflake deletes the temporary UDF.

Transformer for Snowflake does not support defining other types of UDFs in pipelines at this time.

Use a UDF to perform tasks that are not easily done with existing Snowflake system-defined functions. For example, you might use a UDF to convert epoch values to timestamps.
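An epoch-to-timestamp conversion could be implemented with a static Java handler method along these lines. This is an illustrative sketch only; the class and method names are placeholders, not part of Transformer for Snowflake or Snowflake:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical handler class for an epoch-to-timestamp UDF.
class EpochToTimestamp {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
                             .withZone(ZoneOffset.UTC);

    // Static handler method: converts epoch seconds to a UTC timestamp string.
    public static String epochToTimestamp(long epochSeconds) {
        return FORMAT.format(Instant.ofEpochSecond(epochSeconds));
    }
}
```

With a class like this as the Java UDF class definition, the handler would be referenced as EpochToTimestamp.epochToTimestamp.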

To use a UDF in a pipeline, perform the following steps:
  1. Define the UDF
  2. Call the UDF

For more information about UDFs, Java UDFs, and inline Java UDFs, including several examples, see the Snowflake documentation.

Define the UDF

You define a UDF differently depending on the type that you use:
  • Precompiled UDF - Define and compile the UDF in Snowflake. For more information, see the Snowflake documentation.

    You can use precompiled UDFs in SQL queries and expressions as long as the UDFs are compiled and available in your Snowflake account.

  • Inline UDF - Define a permanent or temporary UDF in pipeline properties. Transformer for Snowflake compiles the UDF when you run the pipeline.

    Once compiled, a permanent inline UDF is stored in your Snowflake account and is available for other pipelines to call as a precompiled UDF.

Inline UDF Guidelines

Use the following guidelines when designing an inline UDF:
  • Create a class definition for an inline Java scalar UDF.
  • Use a static handler method rather than an instance method.
  • Verify that the UDF works as expected in Snowflake before using it in a pipeline.
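Put together, an inline Java scalar UDF class that follows these guidelines has this general shape. The names here are illustrative placeholders, not a prescribed template:

```java
// Illustrative inline Java scalar UDF class: a single class definition
// with a public static handler method, invoked once per input row.
class MyUdf {
    public static String myHandler(String input) {
        // Guard against NULL input rather than throwing from the handler.
        return input == null ? null : input.trim().toUpperCase();
    }
}
```

With this definition, the Java handler function would be specified as MyUdf.myHandler.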

Define an Inline UDF

To use an inline UDF, define the UDF in pipeline properties. After you define the UDF, you can call it from any valid location in the pipeline.

Note: You do not need to configure pipeline properties to use precompiled UDFs. You can use precompiled UDFs in SQL queries and expressions as long as the UDFs are compiled and available in your Snowflake account.

Before defining an inline UDF, make sure that the code performs as expected in Snowflake. Transformer for Snowflake does not validate UDF logic.

  1. In the pipeline properties, click the Advanced tab.
  2. In the UDF Configs section, configure the following properties:
    • UDF Name - Name to use when calling the UDF from the pipeline.
    • Java Handler Function - Static Java handler function for the UDF.
    • Temporary Function - When selected, creates a temporary UDF that is available for the duration of the pipeline run. When cleared, creates a permanent UDF that is stored in your Snowflake account and available for other pipelines to call as a precompiled UDF.
    • Target Path for Jar - Path to a location in your Snowflake account to store the compiled jar file. Do not include a jar file name. Default is @~/.
    • Java UDF Class Definition - Java UDF class definition. For suggested design guidelines, see Inline UDF Guidelines.

  3. To define another UDF, click Add Another.
For example, the following inline Java scalar UDF from the Snowflake documentation concatenates the strings in an array into a single Varchar value:
create or replace function concat_varchar_2(a ARRAY)
returns varchar
language java
handler='TestFunc_2.concat_varchar_2'
target_path='@~/TestFunc_2.jar'
as
$$
    class TestFunc_2 {
        public static String concat_varchar_2(String[] string_array) {
            return String.join(" ", string_array);
        }
    }
$$;

Note the handler function, target path, and the class definition between the two sets of dollar signs ($$).

To define a similar temporary UDF in a pipeline, specify a name for the UDF and provide the following details on the Advanced tab of the pipeline properties:
  • UDF Name - concatArray. This can be any string that you want.
  • Java Handler Function - TestFunc_2.concat_varchar_2
  • Temporary Function - Selected, to create a temporary inline UDF.
  • Target Path for Jar - @~/
  • Java UDF Class Definition:
    class TestFunc_2 {
        public static String concat_varchar_2(String[] string_array) {
            return String.join(" ", string_array);
        }
    }
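Because Transformer for Snowflake does not validate UDF logic, it can help to exercise the handler as plain Java before adding it to pipeline properties. A minimal local check of the class definition above might look like this; the main method exists only for testing and is not part of the UDF:

```java
class TestFunc_2 {
    public static String concat_varchar_2(String[] string_array) {
        return String.join(" ", string_array);
    }

    public static void main(String[] args) {
        // Expect the array elements joined with single spaces.
        System.out.println(concat_varchar_2(new String[] {"123", "Main", "St"}));
        // prints: 123 Main St
    }
}
```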

Call the UDF

You can call an inline or precompiled Snowflake UDF in any logical location in the pipeline. A logical location is a property where you can specify an expression using Snowflake SQL and reference column names.

Logical locations include the following processors, which are designed for applying functions and expressions:
  • Apply Functions processor
  • Column Transformer processor
  • Snowflake SQL Query origin or processor
You might also use UDFs as follows:
  • To perform custom calculations in an Aggregate processor
  • As part of a condition for a processor, such as the Filter, Join, or Stream Selector processors
  • As part of an additional predicate in a Snowflake Table origin

To call a UDF, use the specified UDF name like a Snowflake system-defined function in your SQL statement.

For example, you might use the concatArray UDF defined above in a Filter processor condition to pass only rows where concatenating the address array column produces a non-null result:
concatArray(address) is not null
Note that if you had the same UDF logic in a precompiled UDF in your Snowflake account, you could skip defining the UDF in pipeline properties and use the same condition in the Filter processor to call the precompiled UDF.