User-Defined Functions
A Snowflake user-defined function (UDF) is a custom function that you can create and then use like any Snowflake system-defined function.
You can call any precompiled UDF in your Snowflake account from a pipeline. When you run a pipeline that includes a precompiled UDF, Transformer for Snowflake includes the UDF as part of pipeline logic, as expected.
- Permanent inline UDF
- Define a permanent inline UDF when you plan to call the UDF in additional pipelines.
- Temporary inline UDF
- Define a temporary inline UDF when you plan to call the UDF in the current pipeline only.
Transformer for Snowflake does not support defining other types of UDFs in pipelines at this time.
Use a UDF to perform tasks that are not easily done with existing Snowflake system-defined functions. For example, you might use a UDF to convert Epoch values to Timestamp.
For more information about UDFs, Java UDFs, and inline Java UDFs, including several examples, see the Snowflake documentation.
Define the UDF
- Precompiled UDF - Define and compile the UDF in Snowflake. For more information,
see the Snowflake documentation.
You can use precompiled UDFs in SQL queries and SQL expressions as long as the UDFs are compiled and available in your Snowflake account.
- Inline UDF - Define a permanent or temporary UDF in pipeline properties. Transformer for Snowflake compiles the UDF when you run the pipeline.
Once compiled, a permanent inline UDF is stored in your Snowflake account and available for other pipelines to call as a precompiled UDF.
Inline UDF Guidelines
- Create a class definition for an inline Java scalar UDF.
- Use a static handler method rather than an instance method.
- Verify that the UDF works as expected in Snowflake before using it in a pipeline.
Define an Inline UDF
To use an inline UDF, define the UDF in pipeline properties. After you define the UDF, you can call it from any valid location in the pipeline.
Before defining an inline UDF, make sure that the code performs as expected in Snowflake. Transformer for Snowflake does not validate UDF logic.
- In the pipeline properties, click the Advanced tab.
-
In the UDF Configs section, configure the following
properties:
UDF Property Description UDF Name Name to use when calling the UDF from the pipeline. Java Handler Function Static Java handler function for the UDF. Temporary Function When selected, creates a temporary UDF that is available for the duration of the pipeline run. When cleared, creates a permanent UDF that is stored in your Snowflake account and available for other pipelines to call as a precompiled UDF.
Target Path for Jar Path to a location in your Snowflake account to store the compiled jar file. Do not include a jar file name. Default is
@~/
.Java UDF Class Definition Java UDF class definition. For suggested design guidelines, see Inline UDF Guidelines.
- To define another UDF, click Add Another.
create or replace function concat_varchar_2(a ARRAY)
returns varchar
language java
handler='TestFunc_2.concat_varchar_2'
target_path='@~/TestFunc_2.jar'
as
$$
class TestFunc_2 {
public static String concat_varchar_2(String[] string_array) {
return String.join(" ", string_array);
}
}
$$;
Note the handler function, target path, and the class definition between the two sets of dollar signs ($$).
UDF Property | Value |
---|---|
UDF Name | concatArray – This can be any string that you want. |
Java Handler Function | TestFunc_2.concat_varchar_2 |
Temporary Function | Select to create a temporary inline UDF. |
Target Path for Jar |
@~/
|
Java UDF Class Definition |
|
Call the UDF
You can call an inline or precompiled Snowflake UDF from any property where you can specify a Snowflake SQL expression and invoke a column name.
- Apply Functions processor
- Column Transformer processor
- Snowflake SQL Query origin or processor
- To perform custom calculations in an Aggregate processor
- As part of a condition for a processor, such as the Filter, Join, or Stream Selector processors
- As part of an additional predicate in a Snowflake Table origin
To call a UDF, you simply use the specified UDF name like a Snowflake-defined function in your SQL statement.
address
map column is not
null:concatArray(address) is not null
Note that if you had the
same UDF logic in a precompiled UDF in your Snowflake account, you could skip defining
the UDF in pipeline properties and use the same condition in the Filter processor to
call the precompiled UDF.