Cluster Compatibility Matrix
The following matrix shows the Transformer Scala version required for each supported cluster version and its underlying Spark version.
You can use this matrix to determine the Transformer engine version to use in your deployment or the Transformer installation package to install.
Cluster Type | Supported Cluster Versions | Cluster Underlying Spark Version | Transformer Scala Version |
---|---|---|---|
Amazon EMR | 5.20.0 or later 5.x | 2.4.x | Scala 2.11 |
Amazon EMR | 6.1 and later 6.x | 3.x | Scala 2.12 |
Azure HDInsight | 4.0 | 2.4.x | Scala 2.11 |
Databricks | 5.x - 6.x | 2.4.x | Scala 2.11 |
Databricks | 7.x | 3.0.1 | Scala 2.12 |
Databricks | 8.x | 3.1.1 | Scala 2.12 |
Dataproc | 1.3 | 2.3.4 | Scala 2.11 |
Dataproc | 1.4 | 2.4.7 | Scala 2.11 |
Hadoop YARN (1), CDH distribution | 5.9.x and later 5.x (2); 6.1.x and later 6.x | 2.3.0 and later 2.x | Scala 2.11 with Java JDK 8 |
Hadoop YARN (1), Hortonworks distribution | 3.1.0.0 | 2.3.0 and later 2.x | Scala 2.11 |
Hadoop YARN (1), MapR distribution (3) | 6.1.0 | 2.3.0 and later 2.x | Scala 2.11 |
Microsoft SQL Server 2019 Big Data Cluster | SQL Server 2019 Cumulative Update 4 or later | 2.3.0 or later 2.x | Scala 2.11 |
Spark Standalone (4) | NA | NA | Any |

(1) Before using a Hadoop YARN cluster, create the required directories and update drivers on older distributions, as needed.
(2) Before using CDH version 5.x.x, you must install CDS Powered by Apache Spark version 2.3 Release 3 or higher on the cluster.
(3) Before using MapR, complete the prerequisite tasks.
(4) Spark Standalone clusters are supported for development workloads only.
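
If you need to check the matrix from a deployment script, it can be encoded as a small lookup table. The following Python sketch is illustrative only: `MATRIX` and `scala_for` are hypothetical names and are not part of Transformer. It matches on cluster type and Spark major version, which is coarser than the matrix itself, so treat a match as a hint and confirm the cluster version and footnotes against the table.

```python
# Rows transcribed from the matrix above (footnote markers omitted):
# (cluster type, supported cluster versions, underlying Spark version, Scala version)
MATRIX = [
    ("Amazon EMR", "5.20.0 or later 5.x", "2.4.x", "Scala 2.11"),
    ("Amazon EMR", "6.1 and later 6.x", "3.x", "Scala 2.12"),
    ("Azure HDInsight", "4.0", "2.4.x", "Scala 2.11"),
    ("Databricks", "5.x - 6.x", "2.4.x", "Scala 2.11"),
    ("Databricks", "7.x", "3.0.1", "Scala 2.12"),
    ("Databricks", "8.x", "3.1.1", "Scala 2.12"),
    ("Dataproc", "1.3", "2.3.4", "Scala 2.11"),
    ("Dataproc", "1.4", "2.4.7", "Scala 2.11"),
    ("Hadoop YARN, CDH distribution", "5.9.x and later 5.x; 6.1.x and later 6.x",
     "2.3.0 and later 2.x", "Scala 2.11 with Java JDK 8"),
    ("Hadoop YARN, Hortonworks distribution", "3.1.0.0",
     "2.3.0 and later 2.x", "Scala 2.11"),
    ("Hadoop YARN, MapR distribution", "6.1.0", "2.3.0 and later 2.x", "Scala 2.11"),
    ("Microsoft SQL Server 2019 Big Data Cluster",
     "SQL Server 2019 Cumulative Update 4 or later",
     "2.3.0 or later 2.x", "Scala 2.11"),
    ("Spark Standalone", "NA", "NA", "Any"),
]


def scala_for(cluster_type, spark_version):
    """Return the Scala versions the matrix lists for a cluster type.

    Assumption: only the Spark major version is compared, so a Spark 2.x
    release older than the one in the matrix would still match.
    Rows whose Spark version is "NA" match any Spark version.
    """
    major = spark_version.split(".")[0]
    matches = {
        scala
        for ctype, _cluster_versions, spark, scala in MATRIX
        if ctype == cluster_type
        and (spark == "NA" or spark.split(".")[0] == major)
    }
    return sorted(matches)


if __name__ == "__main__":
    print(scala_for("Databricks", "3.1.1"))
    print(scala_for("Amazon EMR", "2.4.7"))
```

An empty result means the matrix has no row for that combination, for example Dataproc with a Spark 3.x release.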