Text Data Format with Custom Delimiters

By default, the text data format uses line breaks as record boundaries, generating one record for each line of text. You can configure origins to create records based on custom delimiters instead.

Use custom delimiters when the origin system uses delimiters to separate logical sections of data that you want to use as records. A custom delimiter can be a single character, such as a semicolon, or a sequence of characters. You can even use an XML tag as a custom delimiter to read XML data.

Note: When using a custom delimiter, the origin uses the delimiter characters to create records, ignoring new lines.

For most origins, you can include the custom delimiters in records or you can remove them. For the Hadoop FS and MapR FS origins, you cannot include the custom delimiters in records.
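The difference between including and removing the delimiter can be illustrated with a minimal Python sketch. This is not the origin's implementation: the function name `split_records`, its `include_delimiter` parameter, and the sample XML string are all hypothetical, and the handling of trailing text with no closing delimiter is an assumption of the sketch.

```python
def split_records(text, delimiter, include_delimiter=False):
    """Split text into records on a custom delimiter, ignoring line breaks.

    Hypothetical helper for illustration only.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    records = []
    start = 0
    while True:
        end = text.find(delimiter, start)
        if end == -1:
            break
        # Either keep the delimiter at the end of the record or drop it.
        stop = end + len(delimiter) if include_delimiter else end
        records.append(text[start:stop])
        start = end + len(delimiter)
    # Assumption: leftover text with no closing delimiter becomes a final record.
    if start < len(text):
        records.append(text[start:])
    return records


# A closing XML tag used as a multi-character delimiter.
xml = "<msg>first</msg><msg>second</msg>"
kept = split_records(xml, "</msg>", include_delimiter=True)
dropped = split_records(xml, "</msg>")
print(kept)     # ['<msg>first</msg>', '<msg>second</msg>']
print(dropped)  # ['<msg>first', '<msg>second']
```

With an XML tag as the delimiter, keeping the delimiter leaves each record as a well-formed element, which is one reason you might choose to include it.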

For example, say you configure the Directory origin to process a file with the following text, using a semicolon as a delimiter, and discarding the delimiter:

8/12/2016 6:01:00 unspecified error message;8/12/2016 
6:01:04 another error message;8/12/2016 6:01:09 just a warning message;

The origin generates the following records, with the data in a single text field:

Text
8/12/2016 6:01:00 unspecified error message
8/12/2016 6:01:04 another error message
8/12/2016 6:01:09 just a warning message

Note that the origin retains the line break within the second record, between the date and the time, but does not use it to create a separate record.
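The example above can be reproduced with a short Python sketch that reads the text in small chunks, as a file reader might, and emits a record each time it sees the delimiter. The generator `iter_records` and its `chunk_size` parameter are hypothetical names chosen for illustration, and the buffering approach is an assumption rather than a description of how the Directory origin works internally.

```python
import io


def iter_records(stream, delimiter=";", chunk_size=16):
    """Yield records from a text stream, splitting on a custom delimiter.

    Hypothetical sketch: line breaks are treated as ordinary characters,
    and the delimiter is discarded.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record currently held in the buffer.
        while True:
            idx = buffer.find(delimiter)
            if idx == -1:
                break
            yield buffer[:idx]
            buffer = buffer[idx + len(delimiter):]
    # Assumption: leftover text without a closing delimiter is a final record.
    if buffer:
        yield buffer


data = (
    "8/12/2016 6:01:00 unspecified error message;8/12/2016\n"
    "6:01:04 another error message;8/12/2016 6:01:09 just a warning message;"
)
records = list(iter_records(io.StringIO(data)))
for record in records:
    print(repr(record))
```

Printing each record with `repr` makes the retained line break visible as `\n` inside the second record, while the record count stays at three.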