Rules and Alerts

Rules and Alerts Overview

You can define rules to capture information about a running pipeline. You can enable an alert for any rule to be notified when the specified condition occurs.

Rules and alerts for pipelines managed by Control Hub are triggered based on the data processed for each local pipeline instance, by the execution Data Collector that runs it. Rules and alerts for pipelines run by standalone Data Collectors are triggered based on the pipeline data that the Data Collector processes.

You can create the following types of rules:
  • Metric rule - Gathers statistics about the pipeline such as pipeline idle time or error record counts. Provides an alert when enabled.
  • Data rule - Gathers details about data as it passes between two stages. Can provide a meter and alert.
  • Data drift rule - Gathers details about data drift as data passes between two stages. Can provide a meter and alert.
When you enable alerts, you can be informed of the alert in the following ways:
  • Webhooks - All alerts trigger all configured webhooks.
  • Email - You can configure rules to send email alerts.
  • Control Hub UI - Triggered alerts display on the Control Hub Alerts view and as an alert when you monitor the job running the pipeline.

Metric Rules and Alerts

Metric rules and alerts provide notifications about real-time statistics for pipelines.

You can define and enable metric rules so that you are sent an alert when a statistic reaches a certain threshold.

When you enable a metric rule, it automatically enables an alert for the rule. You can also configure a metric rule to send email alerts to all email addresses associated with the pipeline.

You configure metric rules when you configure the pipeline. Data Collector provides a set of default metric rules that you can edit and enable for any pipeline. Metric rules take effect after you enable them.

You can also create custom metric rules. When you create a custom metric rule, you select the metric type. The metric type determines which statistic triggers the alert. You configure the condition that triggers the alert, and enter the text to display in the alert.

Default Metric Rules

Data Collector provides a set of default metric rules that you can edit and enable for any pipeline.

You might want to edit a default metric rule to modify the alert text or the condition for the rule. By default, none of the rules are enabled. Select Active to enable a rule.

Data Collector provides the following default metric rules:

Metric Types

You can use different metric types when you create a metric rule. The metric type determines which statistic triggers the alert.

After selecting a metric type, you select the metric ID which specifies the metric to use. For example, a metric ID can be a runtime statistics gauge or an input records meter. You then select the metric element that defines what the metric is measuring. A metric element can be a count, rate, median, minimum, maximum, or percentage. The possible metric ID and metric element vary by metric type.

Gauge

The gauge metric type provides alerts based on the number of input, output, or error records for the last processed batch. It also provides alerts on the age of the current batch, the amount of time a stage takes to process a batch, or the time that Data Collector last received a record from the origin.

The gauge metric type includes a single metric ID, Runtime Statistics Gauge. You can configure the alert to trigger on the following metric elements:
  • Last Batch Input, Output, or Error Records Counts
  • Last Batch Error Messages Count
  • Current Batch Age
  • Time in Current Stage
  • Time of Last Received Record

For example, you can configure a gauge metric rule that triggers an alert when the pipeline has been processing a batch for more than 5 minutes.

Counter

The counter metric type provides alerts based on the number of input, output, or error records for the pipeline or for a stage in the pipeline.

The counter metric type includes the following metric IDs:
  • Pipeline batch count.
  • Number of input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline.

For any of the selected metric IDs, you can configure the alert to trigger on the count metric element.

For example, you can configure a counter metric rule that triggers an alert when a pipeline encounters more than 1,000 error records.

Histogram

The histogram metric type provides alerts based on a histogram of different record types and stage errors for the pipeline or for a stage in the pipeline.

The histogram metric type provides alerts about the Records Per Batch Histogram statistics for a pipeline or a stage.

The histogram metric type includes metric IDs for the input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline. You can configure the alert to trigger on the metric elements displayed in the monitoring histogram: mean, standard deviation, percentage, or count.

For example, you can configure a histogram metric rule that triggers an alert when the mean of all input records processed by the pipeline reaches 10,000.

Meter

The meter metric type provides alerts based on rates of different record types and stage errors for the pipeline or for a stage in the pipeline.

The meter metric type can provide alerts about the number of batches processed by the pipeline. The meter metric type can also provide alerts about the Record Count and Record Throughput statistics for the pipeline or a stage.

The meter metric type includes metric IDs for the pipeline batch count and for the input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline. You can configure the alert to trigger on the following metric elements displayed in the Record Count and Record Throughput statistics: count, time rates, or mean.

For example, you can configure a meter metric rule that triggers an alert when the number of output records that a stage processes reaches 5,000 in one minute.

Timer

The timer metric type provides alerts based on batch processing timers for the pipeline or for a stage in the pipeline.

The timer metric type provides alerts about the Batch Throughput and Batch Processing Timer statistics for the pipeline or a stage.

The timer metric type includes the following metric IDs:
  • Pipeline Batch Processing Timer - Amount of time for the pipeline to process a batch.
  • <stage_name> Batch Processing Timer - Amount of time for a stage to process a batch.

You can configure the alert to trigger on the following metric elements displayed in the Batch Processing Timer statistics: mean, standard deviation, percentage, time rates, or count.

For example, you can configure a timer metric rule that triggers an alert when the mean amount of time that the pipeline takes to process a batch reaches 10 minutes.
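
The relationship between metric types and metric elements described above can be summarized as a lookup table. The following Python sketch is only an illustration of that relationship: the lowercase keys and the is_valid_element helper are hypothetical names chosen for this example and are not Data Collector identifiers.

```python
# Metric elements per metric type, transcribed from the descriptions above.
# The lowercase keys and the helper name are illustrative assumptions,
# not identifiers used by Data Collector.
METRIC_ELEMENTS = {
    "gauge": {
        "last batch input records count",
        "last batch output records count",
        "last batch error records count",
        "last batch error messages count",
        "current batch age",
        "time in current stage",
        "time of last received record",
    },
    "counter": {"count"},
    "histogram": {"mean", "standard deviation", "percentage", "count"},
    "meter": {"count", "time rates", "mean"},
    "timer": {"mean", "standard deviation", "percentage", "time rates", "count"},
}


def is_valid_element(metric_type: str, element: str) -> bool:
    """Return True if the element is listed above for the given metric type."""
    allowed = METRIC_ELEMENTS.get(metric_type.lower(), set())
    return element.lower() in allowed
```

For example, is_valid_element("Counter", "Count") returns True, while is_valid_element("Counter", "Mean") returns False because the counter metric type supports only the count element.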

Metric Conditions

When you configure a metric rule, you configure the condition that defines the threshold at which the metric rule triggers an alert. Use the expression language to configure the condition.

The expression language provides the following functions for creating metric rule conditions:

value()
Returns the value of the current metric selected in the metric rule. Use in conditions for any type of metric rule.
For example, the default rule for the Pipeline Error Records Counter metric includes the following condition:
${value() > 100}

The alert is triggered when the pipeline encounters more than 100 error records.

time:now()
Returns the current time of the Data Collector machine as a java.util.Date object. Use in conditions for gauge metric rules.
For example, the default rule for the Runtime Statistics Gauge metric that checks whether the pipeline is idle includes the following condition:
${time:now() - value() > 120000}

The alert is triggered when more than 120,000 milliseconds (2 minutes) have passed since the pipeline last received a record.

For more information about using the expression language, see Expression Language.
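
To reason about what the two default conditions above evaluate, it can help to model them outside the expression language. The following Python sketch mirrors the comparisons only. The function names are hypothetical, and the sketch assumes that both the current time and the gauge value are expressed as milliseconds since the epoch.

```python
import time
from typing import Optional


def error_count_exceeded(value: int, limit: int = 100) -> bool:
    """Mirror of ${value() > 100}: true once the counter passes the limit."""
    return value > limit


def pipeline_idle(last_record_ms: int,
                  now_ms: Optional[int] = None,
                  max_idle_ms: int = 120_000) -> bool:
    """Mirror of ${time:now() - value() > 120000}.

    Assumption: both timestamps are epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_record_ms > max_idle_ms
```

Both comparisons are strict, so a counter value of exactly 100 or an idle time of exactly 120,000 milliseconds does not satisfy the condition.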

Configuring a Metric Rule and Alert

Create a custom metric rule to receive alerts when a real-time statistic reaches a certain threshold. You can create metric rules and alerts when you configure a pipeline. You can edit or delete metric rules when they are not enabled.

Configure metric rules based on the metric type, the metric, an element associated with the metric, and a condition.
  1. In the Properties panel, click the Metric Rules tab, and click the Add icon.
  2. In the Metric Rule dialog box, configure the following properties:
    Metric Rule Property - Description
    Alert Text - Text to display when the alert is triggered.

    Enter text that explains the reason for the alert. For example, "Over 1000 pipeline error records."

    Metric Type - Type of metric information the alert is based on:
    • Gauge
    • Counter
    • Histogram
    • Meter
    • Timer
    Metric ID - Metric to use. Provides a list of available metrics based on the metric type.
    Metric Element - Metric element to use. Provides a list of available elements based on the metric ID.
    Condition - Condition to trigger the alert. Use the expression language to configure the condition.
    Send Email - Sends an email when the alert is triggered.

    To send an email, add email addresses for alerts and configure the Data Collector email properties. For more information, see Configuring Email for Alerts.

  3. Click Save.
    The new metric rule displays in the list.
  4. To enable the rule, select Active.
    You can enable and disable email alerts from the list of rules.

Data Rules and Alerts

Data rules define the information that you want to see about the data that passes between stages. You can create data rules based on any link in the pipeline. You can also enable metrics and create alerts for data rules.

You can configure data rules when you configure the pipeline. To create a data rule, you need familiarity with the data being processed. You might preview data to help determine how to configure data rules.

Configuring a Data Rule and Alert

Create a data rule to view metrics, sample data, and alerts based on the rule. You can configure data rules and alerts when you configure a pipeline. You can edit or delete data rules when they are not enabled.
  1. In the Properties panel, click the Data Rules tab, and then click the Add icon.
  2. In the Data Rule dialog box, configure the following properties:
    Data Rule Property - Description
    Stream - Link selected for the data rule.
    Label - The label to display for the data rule.
    Condition - Condition that defines the data rule. Use the expression language to configure the condition.

    For more information about using the expression language, see Expression Language.

    Sampling Percentage - Percentage of records to sample to generate information for the data rule. A higher percentage can provide greater accuracy but use more resources on the Data Collector machine.
    Sampling Records to Retain - Number of sampled records to keep in memory for display.
    Enable Meter - Enables gathering information for the data rule.
    Enable Alert - Enables an alert based on the data rule. Alerts display when the configured conditions occur.
    Alert Text - Text to display when the alert is triggered.

    Enter text that explains the reason for the alert. For example, "Over 100 missing phone numbers."

    You can use the expression language to define the alert text. For example, use the record:errorMessage() function to display the error message in the alert text. For more information about using the expression language, see Expression Language.

    Threshold Type - Type of threshold that defines when the alert becomes active:
    • Count - A specified number of records.
    • Percentage - A specified percentage of records.
    Threshold Value - Value that defines the threshold at which the rule triggers an alert.
    Min Volume - Minimum number of records to process before evaluating a percentage threshold type.
    Send Email - Sends an email when the alert is triggered.

    To send an email, add email addresses for alerts and configure the Data Collector email properties. For more information, see Configuring Email for Alerts.

  3. Click Save.
    The new data rule displays in the list.
  4. To enable the new rule, click Active.
    You can enable and disable the meter and alert from the list of rules. You can edit and delete disabled rules.
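
The way that Threshold Type, Threshold Value, and Min Volume interact can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not the Data Collector implementation: the DataRuleThreshold class is hypothetical, and the sketch assumes that an alert fires when the measured value is strictly greater than the threshold value.

```python
from dataclasses import dataclass


@dataclass
class DataRuleThreshold:
    """Hypothetical model of the threshold properties of a data rule."""
    threshold_type: str       # "count" or "percentage"
    threshold_value: float
    min_volume: int = 0       # consulted only for percentage thresholds

    def should_alert(self, matched: int, evaluated: int) -> bool:
        """Decide whether an alert would fire.

        matched:   records that satisfied the rule condition
        evaluated: records evaluated so far
        Assumption: the comparison is strictly greater-than.
        """
        if self.threshold_type == "count":
            return matched > self.threshold_value
        if self.threshold_type == "percentage":
            # Wait for the minimum volume before evaluating a percentage.
            if evaluated == 0 or evaluated < self.min_volume:
                return False
            return 100.0 * matched / evaluated > self.threshold_value
        raise ValueError(f"unknown threshold type: {self.threshold_type}")
```

For example, with a percentage threshold of 10 and a minimum volume of 1,000, a run with 50 matching records out of 100 does not trigger an alert because the minimum volume has not been reached, while 150 matching records out of 1,000 does.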

Data Drift Rules and Alerts

You can create data drift rules to indicate when the structure of data changes. You can create data drift rules on any link in the pipeline. You can also enable metrics and create alerts for data drift rules.

The expression language provides data drift functions for creating data drift rules. You can use specific field types with each function. The following table describes the type of data drift rules that you can generate on the different field types:
Data Drift Rule       Drift Function    Valid Field Data Types
Field name changes    drift:names()     list-map, map
Field order changes   drift:order()     list-map
Number of fields      drift:size()      list, list-map, map
Field data type       drift:type()      any

For details about the data drift functions, see Data Drift Functions.

Data Drift Alert Triggers

Data drift alerts trigger when a change of the specified type occurs from record to record.

For example, you have an alert that triggers when the number of fields in the record changes. When processing the records with the following number of columns, an alert triggers for both the third and fourth records:
Record Number Number of Columns
1 10
2 10
3 15
4 10

Data drift functions include an ignoreWhenMissing flag to determine the behavior when the specified field does not exist. When the specified field is missing and ignoreWhenMissing is set to true, an alert is not triggered.

When the specified field is missing and the ignoreWhenMissing flag is set to false, the expression triggers an alert for the missing field, and again for the next record when the field is present.

For example, the following expression checks the data type of the UserID field with ignoreWhenMissing set to false:
${drift:type('/UserID', false)}

Say all records include the UserID field, and then a single record passes without the UserID field. This expression triggers an alert for the record with the missing field, and again when the next record arrives that includes the UserID field.
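
The record-to-record comparison described above can be simulated with a short Python sketch. The drift_alerts function is a hypothetical model, not the drift function implementation. It represents a missing field as None and assumes that, when ignoreWhenMissing is true, a missing field is skipped and the next record is compared with the last value that was present.

```python
from typing import Any, Iterable, List

_UNSET = object()  # marks "no record seen yet"


def drift_alerts(values: Iterable[Any], ignore_when_missing: bool) -> List[int]:
    """Return the 1-based record numbers at which a drift alert would fire.

    values holds the observed property for each record, such as the number
    of fields or the data type. Use None where the field is missing.
    """
    alerts = []
    previous = _UNSET
    for number, current in enumerate(values, start=1):
        if current is None and ignore_when_missing:
            # Assumption: skip the record and keep the last present value.
            continue
        if previous is not _UNSET and current != previous:
            alerts.append(number)
        previous = current
    return alerts
```

Calling drift_alerts([10, 10, 15, 10], False) returns [3, 4], which matches the column count example above. Calling drift_alerts(["STRING", "STRING", None, "STRING"], False) also returns [3, 4]: one alert for the record with the missing field and one for the next record that includes it.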

Configuring Data Drift Rules and Alerts

Create a data drift rule to view metrics, sample data, and alerts based on the rule. You can configure data drift rules and alerts when you configure a pipeline. You can edit or delete data drift rules when they are not enabled.
  1. In the Properties panel, click the Data Drift Rules tab, and then click the Add icon.
  2. In the Data Drift Rule dialog box, configure the following properties:
    Data Drift Rule Property - Description
    Stream - Link selected for the data drift rule.
    Label - The label to display for the data drift rule.
    Condition - Condition that defines the data drift rule. You can use data drift functions and other aspects of the expression language to configure the condition.
    Sampling Percentage - Percentage of records to sample to generate information for the data drift rule. A higher percentage can provide greater accuracy but use more resources on the Data Collector machine.
    Sampling Records to Retain - Number of sampled records to keep in memory for display.
    Enable Meter - Enables gathering information for the data drift rule.
    Enable Alert - Enables an alert based on the data drift rule. Alerts display when the configured conditions occur.
    Alert Text - Text to display when the alert is triggered. You can use the expression language to define the alert text. For more information about using the expression language, see Expression Language.

    By default, uses the following expression to return text related to the drift alert: ${alert:info()}.

    Send Email - Sends an email when the alert is triggered.

    To send an email, add email addresses for alerts and configure the Data Collector email properties. For more information, see Configuring Email for Alerts.

  3. Click Save.
    The new data drift rule displays in the list.
  4. To enable the new rule, click Active.
    You can enable and disable the meter and alert from the list of rules. You can edit and delete disabled rules.
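
The trade-off between the Sampling Percentage and Sampling Records to Retain properties can be pictured with the following Python sketch. It is a rough model under stated assumptions, not a description of how Data Collector samples records: the sample_records function is hypothetical, each record is assumed to be sampled independently at random, and only the most recently sampled records are assumed to be kept.

```python
import random
from collections import deque
from typing import Iterable, List, Optional, Tuple


def sample_records(records: Iterable,
                   sampling_percentage: float,
                   records_to_retain: int,
                   rng: Optional[random.Random] = None) -> Tuple[int, List]:
    """Sample records and keep the most recent ones in memory.

    Returns the number of sampled records and the retained records.
    Assumption: each record is sampled independently with the given
    probability, and older retained records are evicted first.
    """
    rng = rng or random.Random()
    retained = deque(maxlen=records_to_retain)
    sampled = 0
    for record in records:
        if rng.random() * 100 < sampling_percentage:
            sampled += 1
            retained.append(record)
    return sampled, list(retained)
```

A higher sampling percentage evaluates more records, which costs more processing, while the number of records to retain bounds the memory used for display regardless of how many records are sampled.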

Alert Webhooks

You can configure webhooks that are sent when alerts are triggered. A webhook is a user-defined HTTP callback - an HTTP request that the pipeline sends automatically when certain actions occur. You can use webhooks to automatically trigger external tasks based on an HTTP request. Tasks can be as simple as sending a message through an application API or as powerful as passing commands to the Data Collector command line interface.

The pipeline sends all alert webhooks each time an alert is triggered. So when you configure an alert webhook, create a webhook payload that is applicable for all triggered alerts. You can configure a payload that includes the details of each alert.

Important: You must configure webhooks as expected by the receiving system. For details on how to configure incoming webhooks, check the receiving system's documentation. You might also need to enable webhook usage within that system.

When you configure an alert webhook, you specify the URL to send the request and the HTTP method to use. Some HTTP methods allow you to include a request body or payload. In the payload, you can use parameters to include information about the cause of the trigger, such as the pipeline that triggered the alert and the alert details. You can also include request headers, content type, authentication type, username and password as needed.

For details about webhook methods, payloads and parameters, see Webhooks.

Configuring an Alert Webhook

Configure an alert webhook to automatically send an HTTP request each time the pipeline triggers an alert.

  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the Properties panel, click Rules, and then click Webhooks.
  3. On the Webhooks tab, configure the following properties:
    Webhook Property - Description
    Webhooks - Webhook to send when an alert triggers. Using simple or bulk edit mode, click the Add icon to add additional webhooks.
    Webhook URL - URL to send the HTTP request.
    Headers - Optional HTTP request headers.
    HTTP Method - HTTP method. Use one of the following methods:
    • GET
    • PUT
    • POST
    • DELETE
    • HEAD
    Payload - Optional payload to include. Available for PUT, POST, and DELETE methods.

    Use any valid content type.

    You can use webhook parameters in the payload to include information about the triggering event, such as the alert name or condition. Enclose webhook parameters in double curly brackets as follows: {{ALERT_NAME}}.

    Content Type - Optional content type of the payload. Configure this property when the content type is not declared in the request headers.
    Authentication Type - Optional authentication type to include in the request. Use None, Basic, Digest, or Universal.

    Use Basic for Form authentication.

    User Name - User name to include when using authentication.
    Password - Password to include when using authentication.
  4. To create an additional webhook, click the Add icon.
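
To check how a payload template reads after its parameters are filled in, you can model the substitution locally. The following Python sketch is an illustration only: the render_payload function is hypothetical, the alert name is sample data, and the sketch assumes that each parameter is replaced by plain text substitution. Only the {{ALERT_NAME}} parameter is taken from the description above.

```python
import json
import re


def render_payload(template: str, params: dict) -> str:
    """Replace {{NAME}} placeholders with values from params.

    Placeholders without a matching entry are left unchanged.
    Assumption: substitution is plain text replacement with no escaping.
    """
    def substitute(match):
        name = match.group(1)
        return str(params.get(name, match.group(0)))

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# A JSON payload template that uses the {{ALERT_NAME}} parameter.
template = '{"text": "Alert triggered: {{ALERT_NAME}}"}'
payload = render_payload(template, {"ALERT_NAME": "Pipeline is idle"})

# Confirm that the rendered payload is still valid JSON.
message = json.loads(payload)
```

Because the same payload is sent for every triggered alert, a template that relies only on parameters such as the alert name stays applicable to all alerts. Note that this sketch does not escape values, so a value that contains a double quotation mark would produce invalid JSON.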

Configuring Email for Alerts

You can define the email addresses to receive metric and data alerts. When an alert triggers an email, Data Collector sends an email to every address in the list.

To send email alerts, create an email account to send the alerts and define the email alert properties in the Data Collector configuration properties.

For information about configuring these properties, see the email alert table in Configuring Data Collector.

  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the Properties panel, click the Rules tab, and then click the Notifications tab.
  3. On the Notifications tab, configure the following properties:
    Notifications Property - Description
    Email IDs - Email addresses that receive every email alert generated by the pipeline.

    You can use simple or bulk edit mode. In simple mode, click the Add icon to specify the first address and the Add Another icon to specify an additional address.

    Error Information Level - Amount of information included in an email notification triggered by an error:
    • All error details
    • Only the error code
    • Error notification with no details
    Note: Error details can include sensitive information.