Field Mapper
When you configure the Field Mapper processor, you specify the following:
- The part of the fields that the mapping expression alters: the path, name, or value.
- An optional conditional expression that specifies the affected fields.
- A mapping expression that specifies how to alter the path, name, or value.
When necessary, you configure the processor to change record structure, to append rather than overwrite records, and to copy existing fields leaving current fields in place.
Field Values
You can use the Field Mapper processor to map an expression to a set of fields to alter field values. You might alter field values by converting negative numbers to positive numbers, setting all negative numbers to zero, or appending a prefix to any string value.
For example, to convert negative numbers to positive numbers, configure the processor to:
- Operate on field values.
- Use a conditional expression to operate on a field value only when the value is an integer or double.
- Use a mapping expression that writes the current value when the value is greater than zero, and otherwise multiplies the value by -1 to convert it to a positive number.
Mapper Property | Value |
---|---|
Operate On | Field Values |
Conditional Expression | ${f:type() == 'INTEGER' or f:type() == 'DOUBLE'} |
Mapping Expression | ${f:value() > 0 ? f:value() : -1 * f:value()} |
The following image shows that the Field Mapper processor changes negative numbers in input data to positive numbers in output data:
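As a rough model of this configuration, the following Python sketch applies the same condition and mapping to a flat dictionary of values. The record layout and the function names are assumptions made for illustration; the sketch does not call any Data Collector API.

```python
# Hypothetical stand-in for a record: field path -> value.
record = {"/a": -5, "/b": 3.5, "/c": "text", "/d": -2.25, "/e": 0}


def condition(value):
    # Models ${f:type() == 'INTEGER' or f:type() == 'DOUBLE'}.
    # bool is excluded because Python treats it as a subclass of int.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def mapping(value):
    # Models ${f:value() > 0 ? f:value() : -1 * f:value()}.
    return value if value > 0 else -1 * value


def map_values(rec):
    # Apply the mapping only to values that satisfy the condition;
    # all other fields pass through unchanged.
    return {path: mapping(v) if condition(v) else v for path, v in rec.items()}


result = map_values(record)
print(result)
```

String fields such as /c are left untouched because they do not satisfy the condition.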
Field Names
You can use the Field Mapper processor to apply a mapping expression that uniformly alters the names of a set of fields. You might alter field names by replacing special characters or adding a suffix. When altering field names, you can configure the processor to change record structure, to append rather than overwrite records, and to copy existing fields leaving current fields in place.
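For example, to replace special characters in field names with underscores, configure the processor to:
- Operate on field names.
- Use a mapping expression that replaces special characters in the name with an underscore.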
Because you want the mapping expression applied to all field names, you do not need to configure a conditional expression.
Mapper Property | Value |
---|---|
Operate On | Field Names |
Mapping Expression | ${str:replaceAll(f:name(), '[.\\-/()^]','_')} |
The following image shows that the Field Mapper processor changes the hyphen in f1-values and the slash in value/list to underscores:
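The renaming can be approximated outside the processor with a regular-expression substitution. This Python sketch assumes the doubled backslash in the mapping expression is an escaped backslash, so the character class is read as [.\-/()^]; the sample record and the rename helper are hypothetical.

```python
import re

# Character class taken from the mapping expression, written as a Python raw string.
SPECIAL = re.compile(r"[.\-/()^]")


def rename(name):
    # Replace every special character in the field name with an underscore.
    return SPECIAL.sub("_", name)


# Hypothetical record keyed by field name.
record = {"f1-values": [1, 2], "value/list": [3], "plain": "x"}
renamed = {rename(name): value for name, value in record.items()}
print(renamed)
```

Names without special characters, such as plain, are unchanged.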
Field Paths
You can use the Field Mapper processor to map an expression to a set of fields to alter the field paths. You might alter field paths to group fields by type or to copy fields with suspicious names and values to an alternative path to be examined. When mapping multiple fields to a single field path, specify an aggregation expression that defines how to aggregate the fields in that path. When altering field paths, you can configure the processor to change record structure, to append rather than overwrite records, and to copy existing fields leaving current fields in place.
Example: Grouping Fields by Type
Suppose you want to group fields with the same data types. You can use the Field Mapper processor to move fields to an already defined map field that contains map fields for each field type.
To do so, configure the processor to:
- Operate on field paths.
- Use a mapping expression that moves fields under the appropriate field type.
- Clear the aggregation expression so the processor writes the field value to the field rather than writing a list that contains the value.
- Enable the processor to change the structure of the record and write new fields.
Note that the input record must already contain map fields for the path, in this case map fields for all the field types, such as /outputs/STRING, /outputs/INTEGER, and /outputs/DOUBLE.
Mapper Property | Value |
---|---|
Operate On | Field Paths |
Mapping Expression | /outputs/${f:type()}/${f:name()} |
Aggregation Expression | None. Note: Clear the default expression because the mapping expression maps only one field to each field path. |
Structure Change Allowed | Enabled |
The following image shows that the Field Mapper moves fields with values under the outputs field by appropriate field type:
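The move can be sketched in Python as follows. This is a simplified model under stated assumptions: the record is a plain dictionary with top-level fields only, type_name is a hypothetical stand-in for f:type(), and the existing outputs map is skipped rather than remapped.

```python
def type_name(value):
    # Hypothetical stand-in for f:type(), covering three types only.
    if isinstance(value, bool):
        raise TypeError("type not covered by this sketch")
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    raise TypeError("type not covered by this sketch")


def group_by_type(record):
    # Start from a copy of the existing /outputs/<TYPE> maps.
    outputs = {t: dict(fields) for t, fields in record["outputs"].items()}
    for name, value in record.items():
        if name == "outputs":
            continue
        t = type_name(value)
        if t not in outputs:
            # Models the requirement that the target map field already exists.
            raise KeyError(f"/outputs/{t} is not defined in the input record")
        # Models the mapping expression /outputs/${f:type()}/${f:name()}.
        outputs[t][name] = value
    # Original top-level fields are dropped because the fields are moved, not copied.
    return {"outputs": outputs}


record = {
    "id": 7,
    "price": 9.5,
    "city": "Oslo",
    "outputs": {"STRING": {}, "INTEGER": {}, "DOUBLE": {}},
}
grouped = group_by_type(record)
print(grouped)
```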
Example: Copying Suspicious Fields
Suppose you want to identify suspicious fields but leave the fields in place until you can examine them. You consider a field suspicious if the field name contains the text name and the value contains a special character. You can use the Field Mapper processor to map an expression to suspicious fields. The expression maps the suspicious fields to an already defined map field for reviewing. When writing to the review field, the processor creates a map field for each suspicious field and lists the original path from the record along with its value.
Configure the processor to:
- Operate on field paths.
- Use a conditional expression to operate only on fields that match your definition of suspicious.
- Use a mapping expression to write the matching fields to a new path.
- Use an aggregation expression to create a map for each field written to the new path and list the original path and value in the field.
In this case, you also configure the processor to change the structure of the record and to maintain the original paths by copying the suspicious fields. The input record must contain a map field for the path, in this case, review.
Mapper Property | Setting |
---|---|
Operate On | Field Paths |
Conditional Expression | ${f:type() == 'STRING' and str:contains(f:name(), 'name') and str:matches(f:value(), '^[;\\-*".&*\\^].*')} |
Mapping Expression | /review/suspiciousNames |
Aggregation Expression | ${map(fields, fieldByPreviousPath())} |
Structure Change Allowed | Enabled |
Maintain Original Paths | Enabled |
The following image shows that the Field Mapper processor copies suspicious fields to the review field, where it lists the original path to the field and its value:
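A minimal Python sketch of the same logic follows. It assumes a flat dictionary of field paths, takes the field name to be the last path segment, and uses hypothetical helper names; the regular expression is copied from the conditional expression with the doubled backslashes read as single escapes.

```python
import re

# Pattern from the conditional expression: value starts with a special character.
SUSPICIOUS_VALUE = re.compile(r'^[;\-*".&*\^].*')


def is_suspicious(name, value):
    # Models: f:type() == 'STRING' and str:contains(f:name(), 'name')
    #         and str:matches(f:value(), ...)
    return (
        isinstance(value, str)
        and "name" in name
        and SUSPICIOUS_VALUE.match(value) is not None
    )


def copy_suspicious(fields, review):
    # Models ${map(fields, fieldByPreviousPath())}: one single-entry map per
    # matching field, keyed by the original path.
    matches = [
        {path: value}
        for path, value in fields.items()
        if is_suspicious(path.rsplit("/", 1)[-1], value)
    ]
    updated = dict(review)
    if matches:
        updated["suspiciousNames"] = matches
    return updated


fields = {
    "/user/first_name": ";DROP TABLE",
    "/user/last_name": "Smith",
    "/user/age": 41,
    "/user/nickname": "*star",
}
review = copy_suspicious(fields, {})
print(review)
```

The fields dictionary is not modified, which corresponds to enabling Maintain Original Paths.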
Aggregation Expressions
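When the mapping expression maps multiple fields to the same path, the aggregation expression defines how the processor aggregates those fields. An aggregation expression can:
- Return a single value, such as the sum or the minimum or maximum of the field values.
- Return a list of values from the fields.
- Return a list of strings containing the original field paths, that is, the field paths before applying the mapping.
- Return a list of field maps, each containing the original field path and value.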
In the aggregation expression, you can use a numeric function or a field function. When calling a numeric or field function, use a special variable, fields, to pass the list of fields to the function. If the processor has a conditional expression, the aggregation expression passes the fields that match the conditional expression. Otherwise, the aggregation expression passes all the fields.
Numeric Functions
The data type that a numeric function returns depends on the data type of the input fields:
Input | Returned |
---|---|
Integer or Long | Long |
Float or Double | Double |
Decimal | Decimal |
- sum(fields)
- Returns the sum of the values in the specified fields.
- min(fields)
- Returns the minimum value of the specified fields.
- max(fields)
- Returns the maximum value of the specified fields.
For example, to return the sum of values, set your aggregation expression to:
${sum(fields)}
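To illustrate how a numeric aggregation collapses several mapped values into one, here is a small Python sketch. The aggregate helper is hypothetical, and Python's int, float, and Decimal types stand in for the Long, Double, and Decimal return types listed above.

```python
from decimal import Decimal

# Python built-ins standing in for the processor's sum, min, and max functions.
AGGREGATIONS = {"sum": sum, "min": min, "max": max}


def aggregate(function_name, values):
    # values models the contents of the special variable `fields`.
    return AGGREGATIONS[function_name](values)


total = aggregate("sum", [3, 4, 5])
smallest = aggregate("min", [2.5, -1.0, 7.25])
decimal_total = aggregate("sum", [Decimal("1.5"), Decimal("2.5")])
print(total, smallest, decimal_total)
```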
Field Functions
Use field functions to return values from fields, field paths from fields, or field paths and values from fields.
- asFields(<list of values>)
- Returns a list of string fields from a list of values. You must use it with the map and previousPath functions to return a list of strings that contain paths to fields. In the aggregation expression, enter:
${asFields(map(fields, previousPath()))}
- fields
- Returns a list of values from the fields.
- map(fields, <path function>)
- Returns a list of the values or map fields returned by a path function for each passed field. Available path functions include:
  - previousPath()
  - Returns a value that contains the path of each passed field. Use in the asFields function to return a list of strings that contain the field paths of fields.
  - fieldByPreviousPath()
  - Returns a map field for each passed field. The map field contains a single entry, with a key equal to the field path of the passed field and the value equal to the value of the passed field.
For example, suppose a record contains the following fields:
Field Path | Value |
---|---|
/f1-names/first | Ann |
/f2-name/first | Bob |
When the mapping expression maps both fields to the output field, you can use field functions in the aggregation expression to return either the values, the field paths, or the field paths and values:
- Return list of values
- Set the aggregation expression to:
${fields}
The processor returns the following output:
- Return list of field paths
- Set the aggregation expression to:
${asFields(map(fields, previousPath()))}
The processor returns the following output:
- Return list of field paths and values
- Set the aggregation expression to:
${map(fields, fieldByPreviousPath())}
The processor returns the following output:
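Based on the function descriptions above, the three aggregation expressions can be modeled in Python as shown here. The tuple representation and the helper names are assumptions, and the printed lists show the logical content only, not how the processor renders the output field.

```python
# Each passed field modeled as (original path, value).
fields = [("/f1-names/first", "Ann"), ("/f2-name/first", "Bob")]


def values(passed):
    # Models ${fields}: a list of the field values.
    return [value for _, value in passed]


def previous_paths(passed):
    # Models ${asFields(map(fields, previousPath()))}: a list of path strings.
    return [path for path, _ in passed]


def field_by_previous_path(passed):
    # Models ${map(fields, fieldByPreviousPath())}: one single-entry map per field.
    return [{path: value} for path, value in passed]


print(values(fields))
print(previous_paths(fields))
print(field_by_previous_path(fields))
```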
Configuring a Field Mapper Processor
Configure a Field Mapper processor to map an expression to a set of fields in a record.
- In the Properties panel, on the General tab, configure the following properties:
General Property | Description |
---|---|
Name | Stage name. |
Description | Optional description. |
Required Fields | Fields that must include data for the record to be passed into the stage. Tip: You might include fields that the stage uses. Records that do not include all required fields are processed based on the error handling configured for the pipeline. |
Preconditions | Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions. Records that do not meet all preconditions are processed based on the error handling configured for the stage. |
On Record Error | Error record handling for the stage: Discard - Discards the record. Send to Error - Sends the record to the pipeline for error handling. Stop Pipeline - Stops the pipeline. Not valid for cluster pipelines. |
- On the Mapper tab, configure the following properties:
Mapper Property | Description |
---|---|
Operate On | Part of the field to be altered: Field Paths, Field Names, or Field Values. |
Conditional Expression | Expression that specifies the set of fields in the record to alter. When not used, Field Mapper applies the mapping expression to all fields in the record. The expression can include field, math, record, string, or time functions. |
Mapping Expression | Expression applied to the paths, names, or values of matching fields. You can use field, math, record, string, or time functions. If mapping to paths, the paths must already exist in the record. |
Aggregation Expression | Expression that specifies how to aggregate multiple fields into the same path. Enter an expression when the mapping expression maps multiple fields to the same path. You can use processor-specific numeric or field functions in the aggregation expression. You can also use math, record, string, or time functions. The default value is ${fields}, which returns a list of the values from each field. |
Structure Change Allowed | Enables the processor to change the structure of the record, such as by adding new fields or changing data types. Available when operating on field paths or field names. |
Append List Values | Enables the processor to append values to list fields when the mapping expression maps fields to an existing list field. When not enabled, the processor replaces existing values in the list field. Available when operating on field paths or field names. |
Maintain Original Paths | Enables the processor to keep existing fields when a mapping expression maps a field to a new name or path. When not enabled, the processor removes existing fields when creating new names or paths for fields. Available when operating on field paths or field names. |
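The difference between enabling and disabling Append List Values can be pictured with a short Python sketch. The write_to_list helper and its arguments are hypothetical and model only the behavior described for this property.

```python
def write_to_list(existing, mapped, append_list_values):
    # existing: values already in the target list field.
    # mapped: values that the mapping expression writes to that field.
    if append_list_values:
        # Enabled: keep the existing values and add the mapped ones.
        return existing + mapped
    # Not enabled: the mapped values replace the existing values.
    return list(mapped)


appended = write_to_list([1, 2], [3, 4], append_list_values=True)
replaced = write_to_list([1, 2], [3, 4], append_list_values=False)
print(appended, replaced)
```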