Column Order

The Column Order processor moves the specified columns into the order that you list. By default, all other columns are included in the row to the right of the specified columns, in their original relative order.

When you configure the Column Order processor, you specify column names in the order that you want them to appear in the row, from left to right. You can use regular expressions to define groups of columns. You can also configure the processor to exclude the columns that are not specified from the generated rows.

The Column Order processor is not case sensitive when matching column names. When generating a row, the processor writes each matched column with the exact name, including capitalization, that you specify in the processor.
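The matching behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the processor's implementation; the function name `resolve_names` and the sample column names are assumptions made for this sketch.

```python
def resolve_names(incoming, specified):
    """Return incoming column names, rewritten to the configured spelling
    wherever they match a specified name regardless of case."""
    # Lookup keyed by lower-cased name; the value is the configured spelling.
    configured = {name.lower(): name for name in specified}
    # Columns with no case-insensitive match keep their original name.
    return [configured.get(col.lower(), col) for col in incoming]


print(resolve_names(["userid", "DATE", "total"], ["UserID", "Transaction", "Date"]))
# ['UserID', 'Date', 'total']
```

The sketch only renames columns. Reordering is a separate step that depends on the order of the configured list.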

Example

You have data from several sources that provide different sets of columns, but all of the data contains a common subset of columns that you want to use. The column names are the same across sources, apart from differences in capitalization.

You configure the Columns to Order property in the processor to list the following subset of columns that you want to use, in the order you want them to appear, and with the capitalization that you want to standardize on:
  • UserID
  • Transaction
  • Date

You also clear the Keep Non-Matched Columns property to exclude other columns from the data.

When the pipeline runs, rows with the following columns pass to the Column Order processor:

  • Row 1: date, transaction, payment_type, total, userid, storeid
  • Row 2: StoreId, Userid, Transaction, Date, Total
  • Row 3: date, storeID, userId, transaction

The Column Order processor alters the three rows, generating rows with the following columns:
  • Row 1: UserID, Transaction, Date
  • Row 2: UserID, Transaction, Date
  • Row 3: UserID, Transaction, Date

Notice that the order and capitalization of the columns now match the processor configuration, and that the columns not specified in the processor are excluded from the data.

If you enable the Keep Non-Matched Columns property, the processor generates the following rows:
  • Row 1: UserID, Transaction, Date, payment_type, total, storeid
  • Row 2: UserID, Transaction, Date, StoreId, Total
  • Row 3: UserID, Transaction, Date, storeID

Notice that the columns not specified in the processor are added to the right of the specified columns, with their original capitalization, in the same relative order.
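The two outcomes in this example can be reproduced with a minimal Python sketch. It models a row as a dictionary and is not the processor's actual implementation. The function name `order_columns`, the `keep_non_matched` parameter, and the integer placeholder values are assumptions introduced for illustration.

```python
def order_columns(row, columns_to_order, keep_non_matched=True):
    """Return a new dict whose keys follow columns_to_order, matched without
    regard to case and written with the configured capitalization."""
    # Index incoming columns by lower-cased name for case-insensitive lookup.
    by_lower = {name.lower(): name for name in row}
    ordered = {}
    matched = set()
    for wanted in columns_to_order:
        original = by_lower.get(wanted.lower())
        if original is None:
            continue  # a listed name with no matching column is skipped
        ordered[wanted] = row[original]  # emit under the configured spelling
        matched.add(original)
    if keep_non_matched:
        # Remaining columns go to the right, in their original relative order.
        for name, value in row.items():
            if name not in matched:
                ordered[name] = value
    return ordered


# Row 1 from the example; the values are placeholders (original positions).
names = ["date", "transaction", "payment_type", "total", "userid", "storeid"]
row1 = dict(zip(names, range(len(names))))
config = ["UserID", "Transaction", "Date"]

print(list(order_columns(row1, config, keep_non_matched=False)))
# ['UserID', 'Transaction', 'Date']
print(list(order_columns(row1, config)))
# ['UserID', 'Transaction', 'Date', 'payment_type', 'total', 'storeid']
```

The sketch relies on Python dictionaries preserving insertion order, which is what keeps the non-matched columns in their original relative order.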

Column Order Logic

Note the following details when configuring a Column Order processor:
  • If a column matches more than one specified column name or regular expression, the column takes the position of the first match.
  • If a listed column name or regular expression has no matching columns, it is ignored.
  • If none of the columns in a row match, the columns remain in their original order.
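The three rules above can be illustrated with a short Python sketch. It is a simplified model, not the processor's implementation. The function name `order_by_patterns` is hypothetical, and the sketch assumes that a pattern must match the whole column name, that regular expressions are also matched without regard to case, and that columns matched by one pattern keep their incoming relative order. None of these assumptions is confirmed by this documentation.

```python
import re


def order_by_patterns(columns, patterns, keep_non_matched=True):
    """Order column names by a list of names or regular expressions."""
    placed = []  # columns that have already been given a position
    for pattern in patterns:
        regex = re.compile(pattern, re.IGNORECASE)
        for col in columns:
            # A column claimed by an earlier pattern keeps that position.
            if col not in placed and regex.fullmatch(col):
                placed.append(col)
        # A pattern that matches nothing adds nothing, so it is ignored.
    if keep_non_matched:
        # Unmatched columns follow, in their original relative order.
        placed += [col for col in columns if col not in placed]
    return placed


cols = ["total", "store_id", "user_id", "date"]

# "user_id" matches both ".*_id" and "user_id"; the first match decides.
print(order_by_patterns(cols, [".*_id", "user_id", "missing.*"]))
# ['store_id', 'user_id', 'total', 'date']

# Nothing matches, so the original order is kept.
print(order_by_patterns(cols, ["missing.*"]))
# ['total', 'store_id', 'user_id', 'date']
```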

Configuring a Column Order Processor

Configure a Column Order processor to change the order of columns in the data.

  1. On the General tab, configure the following properties:
    General Property    Description
    Name                Stage name.
    Description         Optional description.
    Cache Data          Caches processed data.
  2. On the Order tab, configure the following properties:
    Column Order Property       Description
    Columns to Order            Columns to place in the specified order. Place columns in the order that you want them to appear in the row, from left to right.

                                You can enter column names or select columns from preview data. If you specify multiple columns, you can drag them into the appropriate order.

                                You can also specify regular expressions to define groups of columns. For more information, see Column Order Logic.

                                You can use simple or bulk edit mode to configure the property. Use bulk edit mode to easily change the column order.

    Keep Non-Matched Columns    Keeps columns in the incoming data that are not specified in the Columns to Order property. When kept, the columns are included in rows in the same relative order, to the right of the columns specified in the Columns to Order property.

                                When cleared, columns that are not specified in the Columns to Order property are excluded from the data.