Scheduling YARN ApplicationMasters

You can reserve buffer space on On-Demand nodes in Hadoop (Hive) and Spark clusters to run ApplicationMaster (AM) containers. No task containers can be allocated in this reserved space.

Clusters running Hadoop 3 (with Hive 3.1.1 Beta) do not support this capability.

Configuring buffer space on On-Demand nodes is a good precaution: if an AM is scheduled on a Spot node, and the lifetime of the AM is longer than that of the task containers, the loss of the Spot node can result in job failure or multiple retries.

Note

Make sure that the cluster is not configured to run 100% Spot nodes, or 100% Spot worker nodes.

Configuring Space Reservation for ApplicationMasters

To enable space reservation for AMs on On-Demand nodes, add yarn.scheduler.am-reservation.enabled=true to the Override Hadoop Configuration Variables under Hadoop Cluster Settings under the Advanced Configuration tab of the Clusters page in the QDS UI. When you do this, space for four AMs is reserved by default.
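
For example, you would enter the following line in the Override Hadoop Configuration Variables field:

    yarn.scheduler.am-reservation.enabled=true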

The following configuration properties determine the default size of an AM:

  • yarn.app.am.default-size.mb: This is equal to the maximum of (yarn.app.mapreduce.am.resource.mb, tez.am.resource.memory.mb, spark.yarn.am.memory, spark.driver.memory, yarn.app.am.default-size.mb) as set at the cluster level.
  • yarn.app.am.default-size.vcores: This has a value of 1 by default.

You can use the above two parameters to adjust the default AM size, and space for four AMs is reserved accordingly.
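
For example, the following overrides (the values are illustrative) set the default AM size to 2048 MB and 1 vcore; since space for four AMs is reserved, this reserves 4 * 2048 MB = 8192 MB and 4 vcores in total:

    yarn.app.am.default-size.mb=2048
    yarn.app.am.default-size.vcores=1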

You can also override the amount of memory and vcores to be reserved for the AMs in absolute terms by using these configuration properties:

  • yarn.scheduler.am-reservation.capacity.mb: Defines the amount of memory to reserve (in MB). The default is 4 * (default AM size).
  • yarn.scheduler.am-reservation.capacity.vcores: Defines the number of vCPUs to reserve. The default is 4.
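
For example, the following overrides (the values are illustrative) reserve 16384 MB and 8 vCPUs for AMs, regardless of the default AM size:

    yarn.scheduler.am-reservation.capacity.mb=16384
    yarn.scheduler.am-reservation.capacity.vcores=8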

If there is not enough space to schedule AMs on existing On-Demand nodes, the cluster does not bring up additional nodes by default, but you can explicitly configure it to do so by means of the Hadoop override parameter yarn.scheduler.am-reservation.get_stable_node=true. Note that the Spot percentage of the cluster can fall below the configured value in such cases.
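
For example, to reserve AM space and also allow the cluster to bring up additional On-Demand nodes when the reserved space is exhausted:

    yarn.scheduler.am-reservation.enabled=true
    yarn.scheduler.am-reservation.get_stable_node=true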

Configuring Timeout to Schedule AMs on Spot Nodes (AWS)

You can use yarn.scheduler.qubole.am-on-stable.timeout.ms to set a timeout, in milliseconds, for scheduling ApplicationMasters (AMs) on Spot nodes.

By default, QDS does not schedule AMs on AWS Spot nodes. This is because Spot nodes can be lost at any time, and losing the AM for a YARN application can be disastrous. This default behavior is controlled by the default value of yarn.scheduler.qubole.am-on-stable.timeout.ms, which is -1.

The default behavior can change as follows:

  • If you override the timeout default, the ResourceManager (RM) tries, for the number of milliseconds you specify, to schedule the AM on an On-Demand node. If it does not succeed, the RM first tries to schedule the AM on any available stable node, and if that fails it schedules the AM on a volatile Spot node.
  • If the cluster uses 100% Spot instances for autoscaling, and the minimum cluster size is less than 5, the RM never waits to schedule an AM; it immediately schedules it on a Spot node if there is available capacity. Again, this behavior can be overridden: you can configure a wait time by means of the above option.
  • If you set the timeout to 0, the RM immediately schedules the AM, considering all nodes (Spot and On-Demand) as candidates.

This parameter can have the values shown in the following table.

Value              Description
-1                 The default value; AMs are not scheduled on volatile Spot nodes.
0                  The RM schedules AMs on volatile Spot nodes whenever possible.
Any other value    The RM waits for <value> milliseconds before scheduling an AM on a volatile Spot node.
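
For example, the following override (the value is illustrative) makes the RM wait up to five minutes for an On-Demand node before falling back to a Spot node:

    yarn.scheduler.qubole.am-on-stable.timeout.ms=300000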