Column Order
The Column Order processor arranges the specified columns in the order that you list them. By default, all other columns are kept in each row to the right of the specified columns, in their original relative order.
When you configure the Column Order processor, you specify column names in the order that you want them to appear in the row, from left to right. You can use regular expressions to define groups of columns. You can also configure the processor to exclude the columns that are not specified from the generated rows.
The Column Order processor matches column names without regard to case. When generating a row, the processor uses the exact column names, including capitalization, that you specify in the processor.
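The naming behavior described above can be illustrated with a minimal Python sketch. It is not the processor's implementation: the function name `apply_specified_names` is hypothetical, and the sketch covers only matching and capitalization, not reordering.

```python
def apply_specified_names(incoming, specified):
    """Respell incoming column names that match a specified name, ignoring case.

    Hypothetical helper for illustration only; it leaves the column order
    unchanged and handles plain column names, not regular expressions.
    """
    # Map each specified name, lowercased, to the exact spelling to output.
    spelling = {name.lower(): name for name in specified}
    return [spelling.get(col.lower(), col) for col in incoming]


print(apply_specified_names(["date", "storeID", "userId"],
                            ["UserID", "Transaction", "Date"]))
# prints ['Date', 'storeID', 'UserID']
```

Columns that match a specified name take the specified spelling, and columns that do not match keep their incoming spelling.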
Example
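You have data from different sources that provide different sets of columns, but all of the data contains a subset of columns that you want to use. The column names are the same, but they differ in capitalization.
To standardize the order and capitalization of those columns, you configure the Column Order processor with the following columns in the Columns to Order property: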
- UserID
- Transaction
- Date
You also clear the Keep Non-Matched Columns property to exclude other columns from the data.
When the pipeline runs, rows with the following columns pass to the Column Order processor:
- Row 1: date, transaction, payment_type, total, userid, storeid
- Row 2: StoreId, Userid, Transaction, Date, Total
- Row 3: date, storeID, userId, transaction
The processor generates the following rows:
- Row 1: UserID, Transaction, Date
- Row 2: UserID, Transaction, Date
- Row 3: UserID, Transaction, Date
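Notice that the order and capitalization of the columns now match the processor configuration, and the columns that were not specified in the processor are excluded from the data.
If you had instead left the Keep Non-Matched Columns property selected, the processor would generate the following rows: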
- Row 1: UserID, Transaction, Date, payment_type, total, storeid
- Row 2: UserID, Transaction, Date, StoreId, Total
- Row 3: UserID, Transaction, Date, storeID
Notice that the columns that are not specified in the processor are added to the right of the specified columns, with their original capitalization, in the same relative order.
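The example can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the processor's code: the function `order_columns` and its `keep_non_matched` parameter are hypothetical names, the sketch works on lists of column names rather than on real rows, and it handles plain names only.

```python
def order_columns(incoming, specified, keep_non_matched=True):
    """Order column names the way the example describes (hypothetical sketch)."""
    # Lowercased names of the incoming columns, for case-insensitive lookup.
    present = {col.lower() for col in incoming}
    # Specified columns come first, in the configured order and spelling.
    ordered = [name for name in specified if name.lower() in present]
    if keep_non_matched:
        wanted = {name.lower() for name in specified}
        # Remaining columns keep their original spelling and relative order.
        ordered += [col for col in incoming if col.lower() not in wanted]
    return ordered


rows = [
    ["date", "transaction", "payment_type", "total", "userid", "storeid"],
    ["StoreId", "Userid", "Transaction", "Date", "Total"],
    ["date", "storeID", "userId", "transaction"],
]
specified = ["UserID", "Transaction", "Date"]

for row in rows:
    print(order_columns(row, specified, keep_non_matched=False))
    print(order_columns(row, specified, keep_non_matched=True))
```

For each row, the first printed list corresponds to clearing Keep Non-Matched Columns and the second to leaving it selected.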
Column Order Logic
- If a column matches more than one specified column name or regular expression, the column is placed at the position of its first match.
- If a listed column name or regular expression has no matching columns, it is ignored.
- If none of the incoming columns match, the columns remain in their original order.
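The rules above can be sketched in Python as follows. Several details are assumptions that the text does not confirm: that a regular expression must match the whole column name, that regular expressions are also matched without regard to case, that columns matched by one expression keep their incoming relative order, and that such columns keep their incoming spelling. The function name `order_with_patterns` is hypothetical.

```python
import re


def order_with_patterns(incoming, specified, keep_non_matched=True):
    """Apply the ordering rules to a list of column names (hypothetical sketch)."""
    ordered = []
    placed = set()  # incoming columns that already have a position
    for entry in specified:
        # Assumption: every entry is treated as a case-insensitive pattern
        # that must match the entire column name.
        pattern = re.compile(entry, re.IGNORECASE)
        for col in incoming:
            # First match wins: a column that is already placed is skipped.
            if col not in placed and pattern.fullmatch(col):
                # A plain-name match takes the specified spelling;
                # a pattern match keeps the incoming spelling (assumption).
                ordered.append(entry if entry.lower() == col.lower() else col)
                placed.add(col)
        # An entry that matches nothing adds nothing, so it is ignored.
    if keep_non_matched:
        ordered += [col for col in incoming if col not in placed]
    return ordered


columns = ["total", "store_id", "user_id", "date", "store_name"]
print(order_with_patterns(columns, ["Date", "store_.*", ".*_id", "missing"]))
# prints ['Date', 'store_id', 'store_name', 'user_id', 'total']
```

Here `store_id` matches both `store_.*` and `.*_id` and is placed by the first of the two, `missing` matches nothing and is ignored, and `total` is kept to the right because it matches no entry.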
Configuring a Column Order Processor
Configure a Column Order processor to change the order of columns in the data.
- On the General tab, configure the following properties:

| General Property | Description |
| --- | --- |
| Name | Stage name. |
| Description | Optional description. |
| Cache Data | Caches processed data. |

- On the Order tab, configure the following properties:

| Column Order Property | Description |
| --- | --- |
| Columns to Order | Columns to place in the specified order. Place columns in the order that you want them to appear in the row, from left to right. You can enter column names or select columns from preview data. If you specify multiple columns, you can drag them into the appropriate order. You can also specify regular expressions to define groups of columns. For more information, see Column Order Logic. You can use simple or bulk edit mode to configure the property. Use bulk edit mode to easily change the column order. |
| Keep Non-Matched Columns | Keeps columns in the incoming data that are not specified in the Columns to Order property. When selected, the columns are included in rows in the same relative order, to the right of the columns specified in the Columns to Order property. When cleared, columns that are not specified in the Columns to Order property are excluded from the data. |