Sort
The Sort processor sorts incoming data based on one or more specified fields. The processor can sort data in ascending or descending order.
For example, let's say that you create a batch pipeline to read all available data in the orders table in a relational database, transform the data, and then write the data to a destination system. Before writing the data, you want the pipeline to sort all records by the order ID. To do this, you add a Sort processor before the destination, and configure the processor to sort by the order_id field in ascending order.
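The effect of that configuration can be sketched in plain Python. This is only an illustration of the resulting record order, not how the processor itself is implemented; the sample records and the customer field are hypothetical.

```python
# Hypothetical sample records standing in for rows read from the orders table.
orders = [
    {"order_id": 1003, "customer": "Lee"},
    {"order_id": 1001, "customer": "Ortiz"},
    {"order_id": 1002, "customer": "Khan"},
]

# Sort by the order_id field in ascending order, mirroring the
# Sort processor configuration described above.
sorted_orders = sorted(orders, key=lambda record: record["order_id"])

for record in sorted_orders:
    print(record["order_id"], record["customer"])
```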
Sort by Multiple Fields
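When you sort by multiple fields, the Sort processor sorts data according to the order of the listed fields on the Sort tab. The processor sorts by the first listed field, then uses each subsequent field to order records that have the same values in the preceding fields.
For example, suppose you configure the processor to sort by the following fields, in the order listed:
- grade
- last_name
- first_name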
You preview the pipeline with sample data. Preview displays the following input and output data for the Sort processor, showing how the record order has changed:
Notice how grade 2 students are listed in the last three records in the input data, but the processor reorders them as the first three records in the output data. The output data also shows how the processor additionally sorts the grade 2 students alphabetically by last name and then by first name.
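A minimal Python sketch of the same multi-field ordering follows. The student records are hypothetical sample data, not the preview data referred to above, and the sketch assumes ascending order for all three fields.

```python
# Hypothetical input records; grade 2 students appear last, as in the example.
students = [
    {"grade": 3, "last_name": "Nguyen", "first_name": "Ana"},
    {"grade": 4, "last_name": "Baker", "first_name": "Tom"},
    {"grade": 2, "last_name": "Smith", "first_name": "Zoe"},
    {"grade": 2, "last_name": "Smith", "first_name": "Al"},
    {"grade": 2, "last_name": "Diaz", "first_name": "Mia"},
]

# Fields in the order they would be listed on the Sort tab.
sort_fields = ["grade", "last_name", "first_name"]

# Build a tuple key so that earlier fields take priority and later
# fields only decide the order of records that tie on earlier ones.
sorted_students = sorted(
    students,
    key=lambda record: tuple(record[field] for field in sort_fields),
)

for s in sorted_students:
    print(s["grade"], s["last_name"], s["first_name"])
```

In this sketch the three grade 2 records move to the front, ordered by last name and then by first name.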
Configuring a Sort Processor
Configure a Sort processor to sort incoming data based on specified fields.
- In the Properties panel, on the General tab, configure the following properties:
  General Property    Description
  Name                Stage name.
  Description         Optional description.
  Cache Data          Caches data processed for a batch so the data can be reused for multiple downstream stages. Use to improve performance when the stage passes data to multiple stages. Caching can limit pushdown optimization when the pipeline runs in ludicrous mode.
- On the Sort tab, configure the following properties for the field that you want to sort by:
  Sort Property       Description
  Field               Name of the field in the input data to sort by.
  Order               Order to sort the data:
                      - Ascending
                      - Descending
- To sort by additional fields, click the Add icon to specify another field name and sort order.
You can use simple or bulk edit mode to configure the fields. When configured to sort by multiple fields, the processor sorts data according to the order of the listed fields on the Sort tab.
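Because each listed field has its own sort order, a configuration can mix ascending and descending fields. The following Python sketch illustrates one way such a list of field and order pairs could be applied. The sort_records helper, the "ASC" and "DESC" labels, and the sample records are all hypothetical, and the sketch does not describe how the processor is implemented.

```python
from operator import itemgetter


def sort_records(records, sort_spec):
    """Sort records by a list of (field, order) pairs.

    Hypothetical helper: order is "ASC" or "DESC", and pairs are
    given in priority order, like the fields listed on the Sort tab.
    """
    result = list(records)
    # Python's sort is stable, so sorting by the lowest-priority field
    # first and the highest-priority field last gives the first listed
    # field precedence while later fields break ties.
    for field, order in reversed(sort_spec):
        result.sort(key=itemgetter(field), reverse=(order == "DESC"))
    return result


records = [
    {"grade": 2, "last_name": "Smith"},
    {"grade": 3, "last_name": "Diaz"},
    {"grade": 2, "last_name": "Diaz"},
    {"grade": 3, "last_name": "Smith"},
]

# Grade descending first, then last name ascending within each grade.
spec = [("grade", "DESC"), ("last_name", "ASC")]
for r in sort_records(records, spec):
    print(r["grade"], r["last_name"])
```

Changing the order of the pairs in spec changes which field takes precedence, which is the same effect as reordering the fields on the Sort tab.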