Sort
The Sort processor organizes rows based on the specified sort columns and sort order. The processor passes all incoming rows downstream with an updated row order.
For example, to place the transactions in the order that they occurred, you might sort transaction data by the transaction timestamp using ascending order.
When you configure the Sort processor, you specify the columns to sort by and the sort order for each column. The processor provides the following sort orders: ascending, descending, ascending with null values first, and descending with null values last.
If you specify more than one sort column, the processor sorts rows using the first listed column, then within those results, sorts using the second listed column, and so on.
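
The precedence rule for multiple sort columns can be illustrated with a minimal Python sketch. The rows and column meanings below are hypothetical, and the sketch only mimics the ordering rule; it is not how the processor is implemented.

```python
from operator import itemgetter

# Hypothetical rows of (region, amount); not taken from the processor itself.
rows = [
    ("west", 30),
    ("east", 20),
    ("west", 10),
    ("east", 40),
]

# Sort by the first listed column (region), then by the second listed
# column (amount) within rows that share the same region. Both ascending.
sorted_rows = sorted(rows, key=itemgetter(0, 1))

for row in sorted_rows:
    print(row)
```

The second column only affects the order of rows that tie on the first column.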
The Sort processor performs a Snowflake Order By to sort rows. For details on how Snowflake orders data, see the Snowflake documentation.
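
As a rough illustration of how the sort configuration might correspond to an Order By clause, the following Python sketch builds a clause from a list of column and sort order pairs. The `build_order_by` helper and the `ORDER_KEYWORDS` mapping are hypothetical names introduced for this sketch, and the exact SQL that the processor generates may differ.

```python
# Hypothetical mapping from the processor's sort orders to ORDER BY keywords.
# This is an assumption for illustration, not the processor's actual output.
ORDER_KEYWORDS = {
    "Ascending": "ASC",
    "Descending": "DESC",
    "Ascending - Nulls First": "ASC NULLS FIRST",
    "Descending - Nulls Last": "DESC NULLS LAST",
}


def build_order_by(sort_columns):
    """Build an ORDER BY clause from (column, order) pairs, in listed order."""
    if not sort_columns:
        raise ValueError("at least one sort column is required")
    parts = [f"{column} {ORDER_KEYWORDS[order]}" for column, order in sort_columns]
    return "ORDER BY " + ", ".join(parts)


clause = build_order_by([
    ("City", "Ascending"),
    ("Sales", "Descending - Nulls Last"),
])
print(clause)  # ORDER BY City ASC, Sales DESC NULLS LAST
```

By default, Snowflake places null values last in ascending order and first in descending order, which is likely why the two additional sort orders cover the opposite placements. See the Snowflake documentation to confirm the behavior for your account settings.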
Example
Say you have the following sales data, where store 3 in Barcelona has no sales information because it was closed for renovations:
Store_ID | City | Sales |
---|---|---|
1 | New York | 22050 |
2 | Madrid | 18330 |
3 | Barcelona | |
4 | Barcelona | 34035 |
5 | Washington D.C. | 5450 |
6 | Washington D.C. | 6300 |
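You configure the Sort processor to sort by the following columns and sort orders:
- City - Ascending
- Sales - Descending - Nulls Last
The processor produces the following output: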
Store_ID | City | Sales |
---|---|---|
4 | Barcelona | 34035 |
3 | Barcelona | |
2 | Madrid | 18330 |
1 | New York | 22050 |
6 | Washington D.C. | 6300 |
5 | Washington D.C. | 5450 |
The processor sorts first by City and then by Sales, since that was the specified order. Rows are sorted by city in ascending order. Then, within records of the same city, the processor sorts rows by sales data in descending order with the null value last.
Notice how the Barcelona rows are ordered with the null value last, and the Washington D.C. rows have the lower value last.
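
The same ordering can be reproduced outside the processor with a short Python sketch, shown here only to make the rule concrete. It assumes the missing Sales value is represented as `None` and uses Python's default string comparison for City, which may not match Snowflake's collation in every case. The `sort_key` function is a hypothetical helper for this sketch.

```python
# Sales data from the example; the missing value is modeled as None.
rows = [
    {"Store_ID": 1, "City": "New York", "Sales": 22050},
    {"Store_ID": 2, "City": "Madrid", "Sales": 18330},
    {"Store_ID": 3, "City": "Barcelona", "Sales": None},
    {"Store_ID": 4, "City": "Barcelona", "Sales": 34035},
    {"Store_ID": 5, "City": "Washington D.C.", "Sales": 5450},
    {"Store_ID": 6, "City": "Washington D.C.", "Sales": 6300},
]


def sort_key(row):
    sales = row["Sales"]
    # 1. City ascending.
    # 2. Non-null sales before null sales (False sorts before True).
    # 3. Sales descending, achieved by negating the value.
    return (row["City"], sales is None, -sales if sales is not None else 0)


sorted_rows = sorted(rows, key=sort_key)
store_ids = [row["Store_ID"] for row in sorted_rows]
print(store_ids)  # [4, 3, 2, 1, 6, 5]
```

The resulting Store_ID order matches the output table above.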
Configuring a Sort Processor
Configure a Sort processor to change the order of data based on the specified columns and sort orders.
- On the General tab, configure the following properties:
General Property | Description |
---|---|
Name | Stage name. |
Description | Optional description. |
Cache Data | Caches processed data. |
- On the Sort tab, configure the following properties:
Sort Property | Description |
---|---|
Sort Columns | Specify the following information for each column that you want to sort by: Column (name of the column to sort by) and Order (sort order to use). For Order, specify one of the following options: Ascending, Descending, Ascending - Nulls First, or Descending - Nulls Last. The processor sorts all rows using the first listed column, then sorts within those results using the second listed column, and so on. For an example, see Example. |