Understanding Workload-based Scaling Limits in YARN-based Clusters

Earlier, Qubole’s YARN-level autoscaling required an admin to monitor to resize the cluster as and when the number of users using that specific cluster increased. In addition to this, there is a need to set the maximum limit on the resources an individual user or a cluster workload can use at a given point in time.

Thus to monitor the scaling limit and the cluster size, Qubole provides Workload Scaling Limits, a new feature that manages user/workload resource limits while autoscaling. This helps an admin to just set a default maximum resources limit for a user or an application. As a result, cluster does not scale up beyond that limit. This removes the need for an admin to manually monitor the cluster size when there is an increase in number of users using that cluster.

To enable this feature, set mapred.hustler.upscale.cap.fairscheduler.max_limits to true in the Qubole cluster’s Hadoop overrides. For information on adding an Hadoop override through the UI, see Managing Clusters.

For information on adding an Hadoop override through a REST API call, see hadoop_settings.

You can directly use fair-scheduler.xml to configure the workload scaling limits. This feature works in conjunction with FairScheduler (FS) configurations described in this table.

Parameter

Description

yarn.scheduler.fair.allocation.file

It is the path to allocation file. An allocation file is an XML manifest describing queues and queue properties in addition to certain policy defaults. This file must be in the XML format. If a relative path is given, the file is searched for on the classpath (which typically includes the Hadoop’s conf directory). It defaults to fair-scheduler.xml.

yarn.scheduler.fair.user-as-default-queue

It denotes whether to use the username associated with the allocation as the default queue name when a queue name is not specified. If it is set to false or unset, all jobs have a shared default queue named default. It defaults to true. If a queue placement policy is given in the allocations file, this property is ignored.

yarn.scheduler.fair.allow-undeclared-pools

This parameter defaults to true. When it is set to true, new queues are created at an application’s submission time because queues are specified as the application’s queue by the submitter or because queues are placed there by the user-as-default-queue property. If this parameter is set to false, any time an application would be placed in a queue that is not specified in the allocations file and placed in the default queue instead. If a queue placement policy is given in the allocations file, this property is ignored.

Understanding the Resource Allocation in a FairScheduler

It is recommended to set the root queue’s maxResources value to a large value. Otherwise, the default maximum limit (queueMaxResourcesDefault) is considered as the root queue’s maxResources, which limits the cluster’s upscaling beyond that maximum value. It is specifically applicable to the case where a certain user’s applications or jobs are submitted to their own queues. If you do not set the root queue’s maxResources, the cluster’s upscaling does not occur as desired, which ultimately deprives the cluster resources for such applications or jobs.

Let us consider a sample FairScheduler configuration as given here.

<allocations>
  <queueMaxResourcesDefault>12000 mb, 2 vcores</queueMaxResourcesDefault>
     <clusterMaxAMShare>0.67</clusterMaxAMShare>
          <!-- Set root queue maxResource definition to a large value if jobs of different users are going to have
          their own queue, otherwise queueMaxResourcesDefault would be considered as the root queue's maxResources.-->
      <queue name="root">
         <maxResources>1000000 mb, 100000 vcores</maxResources>
      </queue>
      <queue name="etl">
      <queue name="prod">
        <maxResources>45000 mb, 5 vcores</maxResources>
      </queue>
     <queue name="dev">
       <maxResources>16000 mb, 3 vcores</maxResources>
      </queue>
  </queue>
</allocations>

In the above FS configuration, default maximum resources limit set for a queue is 12000 mb 2 vcores. A new user or an application that goes into its own queue cannot consume resources more than the maximum resources limit. Therefore, autoscaling would not occur beyond the application/user’s maximum resources limit.

Admins can have custom queues set for different workloads or users having different maximum limits configured by modifying the fair-scheduler.xml. Thus, the admin can set resource limits for individual workloads as well.