XML Parser

Supported pipeline types:
  • Data Collector

The XML Parser parses a well-formed XML document embedded in a string field and passes the parsed data to an output field in the record.

When you configure the XML Parser, you specify the field that contains the XML document and the target field for the parsed results. You can define a delimiter element to separate the document into multiple values. When no delimiter element is defined, the XML Parser passes the entire document to the target field as a map.

When defining the delimiter element, you can use an XML element or a simplified XPath expression. Use an XML element when the element resides directly under the root node. Use a simplified XPath expression to access data deeper in the XML document.
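
The difference between the two delimiter styles can be illustrated outside of Data Collector. The following is a minimal Python sketch, not the processor's implementation: the function name select_values, the sample document, and the rule that a leading slash marks a simplified XPath expression are all assumptions made for illustration. It treats a bare element name as matching only direct children of the root node, and a slash-separated path as reaching deeper elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this illustration.
XML_DOC = """
<root>
  <msg><text>hello</text></msg>
  <msg><text>world</text></msg>
  <data>
    <item><id>1</id></item>
    <item><id>2</id></item>
  </data>
</root>
""".strip()


def select_values(xml_text, delimiter):
    """Return the elements matched by a delimiter (illustrative helper).

    A bare element name matches only direct children of the root node.
    A path that starts with "/" is treated as a simplified XPath
    expression that begins at the root node.
    """
    root = ET.fromstring(xml_text)
    if not delimiter.startswith("/"):
        # Bare element name: look directly under the root node only.
        return root.findall(delimiter)
    steps = delimiter.strip("/").split("/")
    if steps[0] != root.tag:
        # The path does not start at this document's root element.
        return []
    if len(steps) == 1:
        return [root]
    # Walk the remaining steps relative to the root element.
    return root.findall("./" + "/".join(steps[1:]))


# Element directly under the root node.
print([e.findtext("text") for e in select_values(XML_DOC, "msg")])
# Deeper element reached with a path.
print([e.findtext("id") for e in select_values(XML_DOC, "/root/data/item")])
# A bare name that is not directly under the root matches nothing.
print(select_values(XML_DOC, "item"))
```

In this sketch, the bare name item returns no values because item sits under data rather than directly under the root node, which is the case where a path such as /root/data/item is needed.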

When an XML document has more than one value, you can return the first value, all values as a list, or generate a record for each value in the document.

When generating a record, the processor includes all other incoming fields in the generated record. When multiple values in the parsed field produce multiple records, the processor copies the other incoming fields into each generated record.
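
The three ways of handling multiple values can be sketched as follows. This is a hedged Python illustration that models records as dictionaries; it is not how Data Collector represents records internally. The function names, the mode names "first", "list", and "split", and the field names id, xml, and parsed are hypothetical. As a simplification, the sketch copies every incoming field, including the source field, into each output record.

```python
import xml.etree.ElementTree as ET


def parse_values(xml_text, delimiter):
    """Parse each element matched by the delimiter into a flat map.

    Illustrative helper: only direct child elements and their text
    are kept, which is simpler than a full XML-to-map conversion.
    """
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in element}
        for element in root.findall(delimiter)
    ]


def apply_xml_parser(record, source_field, target_field, delimiter, mode):
    """Return the output records for one incoming record.

    mode is a hypothetical name for the multiple-values behavior:
      "first" - keep only the first value
      "list"  - keep all values as a list in one record
      "split" - generate one record per value
    """
    values = parse_values(record[source_field], delimiter)
    if mode == "first":
        return [{**record, target_field: values[0] if values else None}]
    if mode == "list":
        return [{**record, target_field: values}]
    if mode == "split":
        # Every generated record carries the other incoming fields.
        return [{**record, target_field: value} for value in values]
    raise ValueError(f"unknown mode: {mode}")


incoming = {
    "id": 7,
    "xml": "<root><msg><text>a</text></msg><msg><text>b</text></msg></root>",
}

for out in apply_xml_parser(incoming, "xml", "parsed", "msg", "split"):
    print(out["id"], out["parsed"])
```

With the "split" mode, the sample record yields two output records, and the id field appears in both of them alongside a different parsed value.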

You can configure the processor to include the XPath to each parsed XML element and XML attribute in field attributes. When you do, the processor also places each namespace in an xmlns record header attribute.

You can also configure the processor to include XML attributes and namespace declarations in the record as field attributes. By default, it includes XML attributes and namespace declarations in the record as fields.
Note: Field attributes and record header attributes are written to destination systems automatically only when you use the SDC RPC data format in destinations. For more information about working with field attributes and record header attributes, and how to include them in records, see Field Attributes and Record Header Attributes.

For more information about how XML Parser processes XML data, see Reading and Processing XML Data.