Data Preview
Data Preview Overview
You can preview data to help build or fine-tune a pipeline. When using Control Hub, you can also use data preview when developing pipeline fragments.
You can use data preview with complete or incomplete pipelines and fragments. You can also choose from several options to provide source data for the preview.
When you preview data, source data passes through the pipeline or fragment, allowing you to review how the data passes and changes through each stage. You can edit stage properties and run the preview again to see how your changes affect the data. You can also edit preview data to test and tune the pipeline logic.
You can preview data for one stage at a time or for a group of stages. You can also view the data in list or table view, and refresh the preview data.
After running preview, you can view the input and output schema for each stage on the Schema tab in the pipeline properties panel.
Data Preview Availability
You can preview complete and incomplete pipelines and Control Hub pipeline fragments. The Data Preview icon becomes active when data preview is available. Data preview is available when the following conditions are met:
- All stages in the pipeline are connected
- All required properties are defined
Source Data for Data Preview
You can use the following sources of data for the preview:
- Data from the origin - Use available data from the origin.
- Data from the test origin - Use data from the test origin configured in the pipeline or fragment properties.
- Data from a snapshot - Use snapshot data from the same pipeline, another pipeline, or from an active job. Available for pipelines only.
Writing to Destinations and Executors
Since data preview is a tool for development, by default, it does not write data to destination systems or pass data to executors connected to destination stages.
Data preview also does not display the data that is written by destinations in the pipeline. You can, however, view the data that is passed to a destination stage, which is typically similar to what is written to destination systems.
If you like, you can configure the preview to write data to destination systems and to trigger executors connected to destination stages. For example, you might enable writing to an executor to verify that it performs the configured task as expected.
To write to destination systems and connected executors, in the Preview Configuration dialog box, select Write to Destinations and Executors.
Notes
- Date, datetime, and time data - Data preview displays date, datetime, and time data using the default format of the browser locale. For example, if the browser uses the en_US locale, preview displays dates using the following format: MMM d, y h:mm:ss a.
Data preview displays date, datetime, and time data using the time zone that you select in the preview configuration. By default, data preview displays data using the browser time zone.
- Oracle CDC and Oracle CDC Client pipelines - When previewing a pipeline that processes Oracle CDC data, data preview might time out before connecting to the origin system. When this occurs, try increasing the preview timeout to 120,000 milliseconds to allow the origin time to connect.
- Whole file data format - When previewing a pipeline that processes whole file data, data preview displays only one record.
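As a rough illustration of the date note above, the following sketch renders a timestamp in the style of the en_US pattern MMM d, y h:mm:ss a and shifts it into a chosen display time zone. The function name `format_en_us` and the `utc_offset_hours` parameter are hypothetical, and a fixed UTC offset stands in for the time zone selected in the preview configuration. This is not how data preview itself formats values.

```python
from datetime import datetime, timedelta, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_en_us(value: datetime, utc_offset_hours: int = 0) -> str:
    """Render an aware datetime in the style of MMM d, y h:mm:ss a."""
    # Shift the instant into the display time zone (a fixed offset here,
    # which is an assumption made to keep the sketch self-contained).
    local = value.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    hour12 = local.hour % 12 or 12  # hours 0 and 12 both display as 12
    marker = "AM" if local.hour < 12 else "PM"
    return (f"{MONTHS[local.month - 1]} {local.day}, {local.year} "
            f"{hour12}:{local.minute:02d}:{local.second:02d} {marker}")


stamp = datetime(2024, 3, 5, 22, 7, 9, tzinfo=timezone.utc)
print(format_en_us(stamp))      # displayed in UTC
print(format_en_us(stamp, -8))  # same instant, eight hours behind UTC
```

The same instant prints differently depending on the offset, which is why the time zone selected for the preview affects how date values appear.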
Preview Codes
Data preview displays different colors for different types of data. Preview also uses other codes and formatting to highlight changed fields.
Preview Code | Description |
---|---|
Black values | Date data |
Blue values | Numeric data |
Green values | String data |
Red values | Boolean data |
Asterisk | Records that include edited field values |
Red italic field labels | Fields that contain edited data |
Light red background | Fields removed by a stage |
Italic values | Edited data |
Green stage | First stage in a multiple-stage preview |
Red stage | Last stage in a multiple-stage preview |
Input and Output Schema for Stages
After running preview for a pipeline, you can view the input and output schema for each stage on the Schema tab in the pipeline properties panel. The schema includes each field path and data type. The schema from the last preview is also used when you perform the following actions:
- Invoke expression completion for a stage property.
- Click the Select Fields Using Preview Data icon to open the Field Selector dialog box for a stage property.
If you change the schema for a pipeline, for example if you remove a field, rename a field, or change the data type of a field, then you must run preview again so that the schema reflects the change.
In most cases as you configure stage properties, you can use expression completion or the Field Selector dialog box to specify a field path. However, in some cases, you might use the Schema tab to copy a field path.
For example, let’s say you are configuring a Field Type Converter processor to convert the data type of a field by name. After running preview, you select the Field Type Converter in the pipeline canvas, and then click the Schema tab in the pipeline properties panel. You click the Copy Field Path to Clipboard icon to copy the field path from the Schema tab, and then paste the field path into the processor configuration.
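To make the idea of a schema of field paths and data types concrete, here is a minimal sketch that walks a nested record and lists a path and type for each leaf field. The function `schema_of` is hypothetical. The slash-separated path syntax with bracketed list indexes is an assumption, and the type names are Python's, not the data types shown on the Schema tab.

```python
from typing import Any, List, Tuple


def schema_of(value: Any, path: str = "") -> List[Tuple[str, str]]:
    """Return (field path, type name) pairs for every leaf in a record."""
    if isinstance(value, dict):
        pairs = []
        for name, child in value.items():
            # Map fields extend the path with /name (assumed syntax).
            pairs.extend(schema_of(child, f"{path}/{name}"))
        return pairs
    if isinstance(value, list):
        pairs = []
        for index, child in enumerate(value):
            # List items extend the path with [index] (assumed syntax).
            pairs.extend(schema_of(child, f"{path}[{index}]"))
        return pairs
    # Leaf value: report its path and the Python type name.
    return [(path or "/", type(value).__name__)]


record = {"id": 7, "address": {"city": "Oslo"}, "tags": ["new"]}
for field_path, type_name in schema_of(record):
    print(field_path, type_name)
```

If a stage removes or renames a field, the list produced from its output changes accordingly, which mirrors why preview must run again before the schema reflects a change.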
The Schema tab also displays the time of the last data preview.
Previewing a Single Stage
You can preview data for a single stage. In the Preview panel, you can review the values for each record to determine if the stage transforms data as expected.
Previewing Multiple Stages
You can preview data for a group of linked stages within a pipeline.
When you preview multiple stages, you select the first stage and the last stage in the group. The Preview panel then displays the output data of the first stage in the group and the input data of the last stage in the group.
In the Preview panel, you can review the values for each record to determine if the group of stages transforms data as expected.
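The relationship between the two displayed data sets can be sketched as follows. Stages are modeled as plain functions over lists of records, and `preview_group` and the three sample stages are hypothetical names used only for illustration. The sketch assumes a linear group of at least two stages.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def preview_group(stages: List[Stage],
                  source: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Return (output of the first stage, input of the last stage)."""
    if len(stages) < 2:
        raise ValueError("a group needs a first and a last stage")
    first_output = stages[0](source)
    data = first_output
    # Only the stages between the first and the last transform the data
    # that the last stage receives.
    for stage in stages[1:-1]:
        data = stage(data)
    return first_output, data


def uppercase_name(records: List[Record]) -> List[Record]:
    return [{**r, "name": str(r["name"]).upper()} for r in records]


def add_name_length(records: List[Record]) -> List[Record]:
    return [{**r, "name_length": len(str(r["name"]))} for r in records]


def drop_short_names(records: List[Record]) -> List[Record]:
    return [r for r in records if int(r["name_length"]) > 3]


first_out, last_in = preview_group(
    [uppercase_name, add_name_length, drop_short_names],
    [{"name": "ada"}],
)
print(first_out)
print(last_in)
```

Comparing the two results shows what the intermediate stages contributed, without running the last stage itself.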
Editing Preview Data
You can edit preview data to view how a stage or group of stages processes the changed data. Edit preview data to test for data conditions that might not appear in the preview data set.
For example, when the stage filters integer data based on an expression, you might change the input data to test positive and negative integer values, as well as zero.
You can edit the following preview data:
- The output data column for an origin.
- The input data column for processors.
When you edit preview data, you can pass the changed data through the pipeline, or you can revert your changes to return to the original data.
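The integer filter example above can be sketched like this. The `keep_positive` function stands in for a stage that filters on an expression, and the `amount` field and sample values are invented for illustration. Edits are made on a copy so that the original preview data remains available to revert to.

```python
import copy
from typing import Dict, List

Record = Dict[str, int]


def keep_positive(records: List[Record]) -> List[Record]:
    """Hypothetical filter stage: keep records whose amount is above zero."""
    return [r for r in records if r["amount"] > 0]


# Original preview data contains only positive values, so the filter
# appears to pass everything through.
original = [{"amount": 3}, {"amount": 7}, {"amount": 10}]

# Edit a copy to cover conditions missing from the preview data set.
edited = copy.deepcopy(original)
edited[1]["amount"] = -7  # negative value
edited[2]["amount"] = 0   # boundary value

print(keep_positive(original))
print(keep_positive(edited))
```

Running the filter on the edited copy exercises the negative and zero cases, while the untouched original list plays the role of the data you return to when you revert the changes.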
Editing Properties
In data preview, you can edit stage properties to see how the changes affect preview data. For example, you might edit the expression in an Expression Evaluator to see how the expression alters data.
When you edit properties, you can test the change with the existing preview data or you can refresh the preview data.