UDF

The UDF processor enables using a Snowflake scalar user-defined function (UDF) in a pipeline. You can use any UDF that is available to the role specified in the pipeline.

Important: The UDF processor does not display on the Processor tab of the Stage Selector dialog box. To add a UDF processor to a pipeline, use the Functions tab to select the Snowflake UDF that you want to use.

Selecting a Snowflake UDF in the Stage Selector dialog box creates a UDF processor that is customized based on the selected Snowflake UDF. The processor displays the name and signature of the UDF and provides properties that correspond to the arguments defined in the Snowflake UDF.

When you configure the UDF processor, you define the values for arguments. The values can be a constant, column name, or Snowflake expression. You also specify an output column for the results of the UDF.

Prerequisites

Before using the UDF processor, perform the following tasks as needed:
  1. Create a scalar Snowflake UDF.

    You can use any method available with Snowflake to create the UDF. For more information, see the Snowflake documentation.

  2. Ensure that you have a role with access to the UDF.

    Use the role to define the Role pipeline property. The pipeline uses the specified role to perform Snowflake-related tasks, including executing the UDF defined in a UDF processor. For more information, see the Snowflake documentation.

Defining Arguments

Snowflake UDFs can include required and optional arguments. All arguments defined in the selected Snowflake UDF appear as properties in the resulting UDF processor.

Argument properties are listed in the same order that the arguments are listed in the Snowflake UDF.
Tip: Consult the Snowflake UDF signature or definition when configuring the processor to ensure that you define the argument properties appropriately. Argument properties are not named in the processor at this time.
For example, the following salestax Snowflake UDF includes the sales and tax arguments:
CREATE OR REPLACE FUNCTION add_salestax
(sales NUMBER, tax NUMBER)
  RETURNS NUMBER
  AS 'sales+(sales * tax)';

To configure this UDF processor correctly, you treat the first set of argument properties in the processor like the sales argument, and the second set like the tax argument, as follows:

Working with Optional Arguments

When a Snowflake UDF includes optional arguments, those arguments appear in the UDF processor.

When you want to define an optional argument, you must define all arguments, required and optional, that are listed before the optional argument.

For example, say you create a UDF processor based on a UDF that includes one required argument and three optional arguments as follows:
CREATE OR REPLACE FUNCTION MYSCHEMA.MYUDF ("requiredArg" VARCHAR, 
"optArg1" VARCHAR DEFAULT '-', 
"optArg2" VARCHAR DEFAULT '?'),
"optArg3" VARCHAR DEFAULT '!' 
RETURNS VARCHAR

COMMENT='Concats varchar data, then initcaps'
AS
$$
    SELECT initcap( requiredArg || optArg1 || optArg2 || optArg3 )
$$;

If you want to define requiredArg and optArg2, you must also define optArg1.

In the processor, this means that you define the first three arguments. You also remove the last argument, which represents optArg3.

Validation and Testing

When you validate a pipeline that includes a UDF processor, the validation performs some basic checks based on the function signature, such as verifying that all required arguments are defined.

However, validation does not verify that the UDF can successfully run based on incoming data. To do that, try previewing the data or performing a draft run.

Configuring a UDF Processor

Configure a UDF processor to include scalar Snowflake UDF processing in the pipeline.

Unlike most processors, you create a UDF processor from the Functions tab of the Stage Selector dialog box. This creates a custom UDF processor based on the selected Snowflake UDF. The UDF processor does not appear on the Processors tab.

  1. In the pipeline canvas, click the Insert Stage icon or the Add Stage button.
  2. In the Stage Selector, click the Functions tab:
    The Stage Selector dialog box attempts to list all UDFs and stored procedures available to the role defined in the pipeline.
    To limit the results, you can specify a search string, or select a function type or schema.
  3. Select the Snowflake UDF that you want to use.
    The pipeline canvas displays a UDF processor based on the selected Snowflake UDF, providing a set of properties for every argument in the Snowflake UDF.
  4. Verify that you selected the appropriate UDF.
    The processor provides the following information from the Snowflake UDF:
    • Name
    • Signature
    • Description, when available
    Note: You cannot change the Snowflake UDF that the processor uses. If you selected the wrong Snowflake UDF, delete the stage and add a new UDF processor.
  5. In the Arguments area, define the arguments for the processor.
    For each argument, define the following properties, as needed:
    Arguments Property Description
    Value Value, column, or Snowflake expression to pass to the argument.
    Treat as Constant Treats the specified value as a constant instead of a column or Snowflake expression.
    To define an optional argument, define all arguments up to the optional argument.
    Important: Argument properties are listed in the order that they are defined in the Snowflake UDF. To ensure that you define the properties appropriately, consult the Snowflake UDF signature or definition.
  6. Remove any optional arguments that are not defined.
  7. In the Output Column property, specify a column name for the results of the UDF.
    If the column exists, the processor overwrites the existing data. Otherwise, the processor creates the output column.