Filter

The Filter processor passes rows that match the filter condition to downstream stages. Rows that do not match the filter condition are removed from the pipeline.

Use the Filter processor to remove unwanted rows from the pipeline. To route data to separate downstream branches based on different conditions, use the Stream Selector processor.

When you configure the Filter processor, you specify the filter condition to use.

Filter Condition

The filter condition determines the data that passes downstream. The condition must evaluate to true or false for each row. Rows for which the condition evaluates to true pass to the rest of the pipeline.

You can use a condition as simple as AccountID is NOT NULL, or you can create a condition as complex as needed.
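As a rough illustration of a more involved condition, the sketch below combines several tests with AND. The column names account_id, total, and status are hypothetical and stand in for columns in your own data. The comment line is an annotation for this page rather than part of the condition.

```sql
-- Hypothetical columns: account_id, total, status
account_id IS NOT NULL
AND total > 0
AND trim(status) IN ('OPEN', 'PENDING')
```

With this condition, a row would pass downstream only when all three tests evaluate to true.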

Here are some guidelines for filter conditions:

  • When you define a condition, you typically base it on column values in the row.

    For information about referencing columns in the condition, see Referencing Columns in Snowflake SQL Expressions.

  • You can use any syntax that can be used in the WHERE clause of a query, including functions such as isnull or trim and operators such as = or <=.

    For more information about the WHERE clause, see the Snowflake documentation.

  • Do not include WHERE in the condition.
  • You can also use user-defined functions (UDFs) in the condition.

For example, the following condition passes only rows where the year of the transaction date value is 2000 or later:
year(transaction_date) >= 2000
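Because the condition uses WHERE clause syntax, one way to reason about its effect is to picture it inside a query. The sketch below is only a mental model: upstream_rows is a hypothetical name for the data that reaches the processor, and the SQL that the pipeline actually generates may look different.

```sql
-- upstream_rows is a hypothetical stand-in for the rows entering the Filter processor
SELECT *
FROM upstream_rows
WHERE year(transaction_date) >= 2000;  -- only this expression is entered as the filter condition
```

Only the expression after WHERE belongs in the filter condition, which is consistent with the guideline not to include WHERE.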

Sample Conditions

The following list shows some common condition examples, each followed by its description, that you might adapt for your use:

  • total > 0

    If the value in the total column is greater than 0, the row passes downstream. If not, the row is dropped from the pipeline.

  • total <= 0

    If the value in the total column is less than or equal to 0, the row passes downstream. If not, the row is dropped from the pipeline.

  • accountId is NOT NULL

    If the row has a value in the accountId column, the row passes downstream. If the column contains a null value, the row is dropped from the pipeline.

    Note that NULL is not case sensitive. For example, you can alternatively use null or Null in the condition.

  • upper(message) like '%ERROR%'

    If the message column contains the string ERROR, the row passes downstream. If not, the row is dropped from the pipeline.

    The condition changes the strings in the message column to uppercase before performing the evaluation. This allows the condition to also apply to error and Error, for example.

  • initcap(country) like 'China' OR initcap(country) like 'Japan'

    If the value in the country column is China or Japan, the row passes downstream. If not, the row is dropped from the pipeline.

    The condition changes the strings in the country column to capitalize the first letter before performing the evaluation. This allows the condition to also apply to CHINA and japan, for example.
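The last two sample conditions could also be written in other ways. The sketch below shows two alternative spellings, each a separate condition, assuming the Snowflake ILIKE and IN operators are accepted in the filter condition in the same way as in a WHERE clause. The comment lines are annotations for this page rather than part of the conditions.

```sql
-- Case-insensitive pattern match; should pass the same rows as upper(message) like '%ERROR%'
message ILIKE '%error%'

-- Membership test; the patterns contain no wildcards, so LIKE acts as an exact match here
initcap(country) IN ('China', 'Japan')
```

Which form reads best is largely a matter of preference. The IN form tends to stay shorter as the list of accepted values grows.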

Configuring a Filter Processor

Configure a Filter processor to allow only the rows that match a specified condition to pass downstream.

  1. On the General tab, configure the following properties:
    General Property    Description
    Name                Stage name.
    Description         Optional description.
    Cache Data          Caches processed data.
  2. On the Filter tab, specify the filter condition to use.