Error Record Handling
You can configure error record handling at a stage level and at a pipeline level. You can also specify the version of the record to use as the basis for the error record.
When an error occurs as a stage processes a record, Data Collector handles the record based on the stage configuration. One of the stage options is to pass the record to the pipeline for error handling. For this option, Data Collector processes the record based on the pipeline error record handling configuration.
When you configure a pipeline, be aware that stage error handling takes precedence over pipeline error handling. That is, a pipeline might be configured to write error records to file, but if a stage is configured to discard error records, those records are discarded. You might use this functionality to reduce the types of error records that are saved for review and reprocessing.
Note that records missing required fields do not enter the stage. They are passed directly to the pipeline for error handling.
Pipeline Error Record Handling
Pipeline error record handling determines how Data Collector processes error records that stages send to the pipeline for error handling. It also handles records deliberately dropped from the pipeline, such as records without required fields.
The pipeline handles error records based on the Error Records property on the Error Records tab. When Data Collector encounters an unexpected error, it stops the pipeline and logs the error. The pipeline provides the following error record handling options:
- Discard
- The pipeline discards the record. Data Collector includes the records in error record counts and metrics.
- Send Response to Origin
- The pipeline passes error records back to the microservice origin to be included in a response to the originating REST API client. Data Collector includes the records in error record counts and metrics. Use in microservice pipelines only.
- Write to Amazon S3
- The pipeline writes error records and related details to Amazon S3. Data Collector includes the records in error record counts and metrics. You define the Amazon S3 configuration properties.
- Write to Azure Event Hub
- The pipeline writes error records and related details to Microsoft Azure Event Hub. Data Collector includes the records in error record counts and metrics.
- Write to Elasticsearch
- The pipeline writes error records and related details to Elasticsearch. Data Collector includes the records in error record counts and metrics.
- Write to File
- The pipeline writes error records and related details to a local directory. Data Collector includes the records in error record counts and metrics.
- Write to Google Cloud Storage
- The pipeline writes error records and related details to Google Cloud Storage. Data Collector includes the records in error record counts and metrics.
- Write to Google Pub/Sub
- The pipeline writes error records and related details to Google Pub/Sub. Data Collector includes the records in error record counts and metrics.
- Write to Kafka
- The pipeline writes error records and related details to Kafka. Data Collector includes the records in error record counts and metrics.
- Write to Kinesis
- The pipeline writes error records and related details to Amazon Kinesis Streams. Data Collector includes the records in error record counts and metrics.
- Write to MapR Streams
- The pipeline writes error records and related details to MapR Streams. Data Collector includes the records in error record counts and metrics.
- Write to MQTT
- The pipeline writes error records and related details to an MQTT broker. Data Collector includes the records in error record counts and metrics.
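The behavior these options share can be modeled in a few lines: every option, including Discard, counts the record, and the write options also deliver the record and its error details to a target. The following Python sketch is illustrative only. `PipelineErrorHandler`, `sink`, and the dictionary keys are hypothetical names and are not part of Data Collector.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PipelineErrorHandler:
    """Hypothetical model of pipeline-level error record handling."""
    # None models the Discard option; a callable models any "Write to ..." option.
    sink: Optional[Callable[[dict], None]] = None
    error_record_count: int = 0

    def handle(self, record: dict, error_message: str) -> None:
        # Every pipeline-level option includes the record in counts and metrics.
        self.error_record_count += 1
        if self.sink is not None:
            # Write options store the record together with related error details.
            self.sink({"record": record, "error": error_message})


written = []  # stand-in for a file, topic, or bucket
discarding = PipelineErrorHandler()                  # models Discard
writing = PipelineErrorHandler(sink=written.append)  # models, for example, Write to File

for handler in (discarding, writing):
    handler.handle({"id": 1}, "missing required field")
```

In this model both handlers report one error record, but only the writing handler retains anything for review and reprocessing.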
Stage Error Record Handling
Most stages include error record handling options. When an error occurs while a stage processes a record, Data Collector handles the record based on the On Record Error property on the General tab of the stage. Stages provide the following options:
- Discard
- The stage silently discards the record. Data Collector does not log information about the error or note the specific record that encountered an error. The discarded record is not included in error record counts or metrics.
- Send to Error
- The stage sends the record to the pipeline for error handling. The pipeline processes the record based on the pipeline error handling configuration.
- Stop Pipeline
- Data Collector stops the pipeline and logs information about the error. The error that stopped the pipeline displays as an error in the pipeline history.
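The three stage options amount to a simple dispatch, sketched below in Python. This is a simplified model and not Data Collector code: `OnRecordError`, `PipelineStopped`, and `handle_stage_error` are hypothetical names, and the `pipeline_errors` list stands in for handing the record to the pipeline.

```python
from enum import Enum


class OnRecordError(Enum):
    DISCARD = "Discard"
    SEND_TO_ERROR = "Send to Error"
    STOP_PIPELINE = "Stop Pipeline"


class PipelineStopped(Exception):
    """Models Data Collector stopping the pipeline."""


def handle_stage_error(policy, record, error, pipeline_errors):
    if policy is OnRecordError.DISCARD:
        # Silently dropped: nothing is logged and nothing is counted.
        return
    if policy is OnRecordError.SEND_TO_ERROR:
        # Handed to the pipeline, which applies its own error handling.
        pipeline_errors.append((record, error))
        return
    # Stop Pipeline: surface the error and halt processing.
    raise PipelineStopped(error)


pipeline_errors = []
handle_stage_error(OnRecordError.DISCARD, {"id": 1}, "bad value", pipeline_errors)
handle_stage_error(OnRecordError.SEND_TO_ERROR, {"id": 2}, "bad value", pipeline_errors)
```

Only the second record reaches `pipeline_errors`, which reflects why stage handling takes precedence: a record that the stage discards never reaches the pipeline configuration.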
Example
An origin reads JSON data with a maximum object length of 4096 characters and the origin encounters an object with 5000 characters. Based on the stage configuration, Data Collector either discards the record, stops the pipeline, or passes the record to the pipeline for error record handling.
When the stage passes the record to the pipeline, the result depends on the pipeline error record handling configuration:
- When the pipeline discards error records, Data Collector discards the record without noting the action or the cause.
- When the pipeline writes error records to a destination, Data Collector writes the error record and additional error information to the destination. It also includes the error records in counts and metrics.
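The example can be traced with a short Python sketch. It assumes the option names used in this section and a hypothetical `outcome` function. The oversized string is only a stand-in for the 5000-character JSON object, and the returned descriptions are informal summaries, not Data Collector messages.

```python
MAX_OBJECT_LENGTH = 4096  # limit from the example


def outcome(json_text, stage_policy, pipeline_policy):
    """Return a short description of what happens to one JSON object."""
    if len(json_text) <= MAX_OBJECT_LENGTH:
        return "processed"
    if stage_policy == "Discard":
        return "discarded by stage, not counted"
    if stage_policy == "Stop Pipeline":
        return "pipeline stopped, error logged"
    # Send to Error: the pipeline configuration decides what happens next.
    if pipeline_policy == "Discard":
        return "discarded by pipeline"
    return f"written by pipeline ({pipeline_policy}), counted"


oversized = "x" * 5000  # stand-in for the 5000-character object
for stage_policy in ("Discard", "Send to Error", "Stop Pipeline"):
    print(stage_policy, "->", outcome(oversized, stage_policy, "Write to File"))
```

The pipeline policy affects the result only in the Send to Error branch. For the other two stage settings, the pipeline configuration is never consulted.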
Error Records and Version
When Data Collector creates an error record, it preserves the data and attributes from the record that triggered the error, and then adds error-related information as record header attributes. For a list of the error header attributes and other internal header attributes associated with a record, see Internal Attributes.
When you configure a pipeline, you can specify which version of the record to use as the basis for the error record:
- The original record - The record as originally generated by the origin. Use this record when you want the original record without any additional pipeline processing.
- The current record - The record in the stage that generated the error. Depending on the type of error that occurred, this record can be unprocessed or partially processed by the error-generating stage. Use this record when you want to preserve any processing that the pipeline completed before the record caused an error.
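The difference between the two versions can be illustrated with a small Python sketch. The record layout, the `build_error_record` function, and the `errorMessage` attribute name are hypothetical. For the actual error header attributes, see Internal Attributes.

```python
import copy


def build_error_record(original, current, use_original, error_message):
    """Return a copy of the chosen record version with error details attached."""
    base = original if use_original else current
    error_record = copy.deepcopy(base)
    # Error details are added as header attributes; the record data is preserved.
    error_record["header"]["errorMessage"] = error_message  # hypothetical attribute name
    return error_record


# The record as generated by the origin.
original = {"header": {}, "fields": {"name": " ada "}}
# The same record after an earlier stage trimmed the name, before a later stage failed.
current = {"header": {}, "fields": {"name": "ada"}}

from_original = build_error_record(original, current, True, "invalid name")
from_current = build_error_record(original, current, False, "invalid name")
```

Here `from_original` keeps the untrimmed name, while `from_current` keeps the trimming that completed before the error occurred.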