Validation

Transformer performs two types of validation:

Implicit validation: Implicit validation occurs by default as the Control Hub UI saves your changes. Implicit validation lists missing or incomplete configuration, such as an unconnected stage or a required property that has not been configured. Errors found by implicit validation display in the Validation Errors list. Error icons display on stages with undefined required properties and on the canvas for pipeline issues.
Explicit validation: Explicit validation occurs when you click the Validate icon, run a preview, or start a test run of the pipeline. Explicit validation becomes available when implicit validation passes. Explicit validation is a semantic validation that checks all configured values for validity and verifies whether the pipeline can run as configured. For example, while implicit validation verifies that you entered a value for a URI, explicit validation tests the validity of the URI by connecting to the system. Errors found by explicit validation display in a list from the validation error message.
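
The difference between the two types can be illustrated with a short Python sketch. This is not Transformer code: the function names, the REQUIRED_PROPERTIES list, and the injected connect callable are all hypothetical, chosen only to show that implicit validation checks that a value is present, while explicit validation checks that the value actually works.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical list of required property names for a stage.
REQUIRED_PROPERTIES = ["uri"]


def implicit_validate(stage_config: Dict[str, Optional[str]]) -> List[str]:
    """Report required properties that are missing or empty."""
    return [
        f"Required property '{name}' is not configured"
        for name in REQUIRED_PROPERTIES
        if not (stage_config.get(name) or "").strip()
    ]


def explicit_validate(
    stage_config: Dict[str, Optional[str]],
    connect: Callable[[str], None],
) -> List[str]:
    """Check that configured values are usable.

    Runs only after implicit validation passes. `connect` is a
    caller-supplied function (an assumption of this sketch) that
    raises OSError when the external system cannot be reached.
    """
    errors = implicit_validate(stage_config)
    if errors:
        return errors

    uri = stage_config["uri"]
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.netloc:
        return [f"URI '{uri}' is not well formed"]

    try:
        connect(uri)
    except OSError as exc:
        return [f"Cannot connect to '{uri}': {exc}"]
    return []
```

In this sketch, a value such as "not a uri" passes implicit_validate because the property is set, but explicit_validate rejects it before attempting a connection.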

Spark for Explicit Validation

When you click the Validate icon to perform an explicit validation of a pipeline, you choose the Spark libraries that Transformer uses to validate the pipeline:

Embedded Spark libraries: Transformer includes embedded Spark libraries that you can use to validate a local or cluster pipeline. When you validate using the embedded Spark libraries, Transformer validates the pipeline without communicating with the Spark installation on the local Transformer machine or on the cluster. Validation using the embedded Spark libraries typically completes quickly. However, the validation fails if the Transformer machine cannot access the external systems that the pipeline connects to.
Spark cluster configured for the pipeline: Transformer validates the pipeline using the Spark cluster configured for the pipeline. When you validate a local pipeline using the configured Spark cluster, Transformer launches a Spark application in the local Spark installation on the Transformer machine, and then performs the validation in the local Spark installation. When you validate a cluster pipeline using the configured Spark cluster, Transformer launches a Spark application in the configured cluster, and then performs the validation on the Spark cluster. When you use the configured cluster, Transformer performs the same validation as when you start the pipeline. However, using the configured cluster can cause the validation to take longer.

In most cases, you'll want to validate a pipeline using the configured Spark cluster so that Transformer performs the same validation as when you start the pipeline, even though the validation can take longer.
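
As a quick summary of where explicit validation runs for each choice, the following Python sketch encodes the combinations described above as a lookup. The SparkLibraries enum and the validation_location function are hypothetical names used for illustration only; they are not part of Transformer.

```python
from enum import Enum


class SparkLibraries(Enum):
    """The two choices offered when you click the Validate icon."""
    EMBEDDED = "embedded"
    CONFIGURED = "configured"


def validation_location(is_cluster_pipeline: bool, libraries: SparkLibraries) -> str:
    """Describe where validation runs for a pipeline type and library choice."""
    if libraries is SparkLibraries.EMBEDDED:
        # Same behavior for local and cluster pipelines: the Spark
        # installation is not contacted.
        return "embedded Spark libraries in Transformer"
    if is_cluster_pipeline:
        return "Spark application on the configured cluster"
    return "Spark application in the local Spark installation"


if __name__ == "__main__":
    for is_cluster in (False, True):
        for choice in SparkLibraries:
            kind = "cluster" if is_cluster else "local"
            print(f"{kind} pipeline, {choice.value}: "
                  f"{validation_location(is_cluster, choice)}")
```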