Spark Dynamic Allocation Prerequisite
Before you run a pipeline on a MapR cluster, you must set up Spark dynamic allocation on the cluster.
MapR provides a blog post that describes how to perform this task. Perform all of the steps described in the post, with the following change.
At this time, the "Enabling Dynamic Allocation in Apache Spark" section says to add the following entries to the /opt/mapr/spark/spark-1.6.1/conf/spark-defaults.conf file:
spark.dynamicAllocation.enabled = true
spark.shuffle.service.enabled = true
spark.dynamicAllocation.minExecutors = 5
spark.executor.instances = 0
Setting spark.executor.instances to 0 generates an error. Instead, set spark.executor.instances to 1 or higher, up to the maximum number of executors allowed in the Transformer instance.
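For example, the corrected entries in /opt/mapr/spark/spark-1.6.1/conf/spark-defaults.conf might look like the following. This is a sketch only; the value 1 is a minimal working choice, and you can raise it to any value up to the executor limit allowed in your Transformer instance:

```
spark.dynamicAllocation.enabled = true
spark.shuffle.service.enabled = true
spark.dynamicAllocation.minExecutors = 5
# Must be 1 or higher; 0 generates an error.
spark.executor.instances = 1
```

After editing the file, restart the Spark services on the cluster so the new settings take effect, as described in the blog post.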