Maintaining Buffer Capacity in Clusters

Qubole has introduced a new enhancement known as buffer capacity. It helps the cluster admins in configuring the cluster with some specified capacity as buffer capacity that is used when a query is submitted on the cluster.

The cluster always contains this buffer (configured) capacity free throughout its lifetime except when the cluster size exceeds or reaches its configured maximum cluster size. The advantage of buffer capacity is that a new command does not have to wait for the cluster to upscale and it can quickly get executed.

Qubole supports this enhancement on Hadoop (Hive), Presto, and Spark clusters.

Note

Hadoop 3 (with Hive 3.1.1 (beta)) clusters do not support this enhancement.

The configuration for Hadoop (Hive)/Spark differ from that of Presto, which are described in:

Configuring Buffer Capacity for Hadoop and Spark Clusters

You can configure a Hadoop (Hive) or Spark cluster to have some specified capacity as buffer capacity.

Note

The configuration is in terms of nodes but the actual reservation is in terms of capacity.

You can pass the following YARN parameters as Hadoop overrides in the cluster’s (UI) Advanced Configuration > HADOOP SETTINGS > Override Hadoop Configuration Variables:

  • yarn.cluster_start.buffer_nodes.count: Set this count to a required value to configure the buffer capacity. For example, if the value is set to 2, then capacity worth of 2 minimum nodes (configured in the cluster) is in buffer (free to be used) during the cluster’s lifetime. So, if the minimum node was set to r4.xlarge (30.5 GB, 4 vCPUs), then (61 GB, 8 vCPUs) is in buffer capacity. Upcoming queries on that specific cluster use this buffer capacity.
  • yarn.autoscaling.buffer_nodes.count.is_dynamic: You can also let Qubole maintain the buffer capacity by setting this configuration property to true in Hadoop overrides.

Setting Advanced Configuration

When yarn.autoscaling.buffer_nodes.count.is_dynamic is set to true, Qubole determines the buffer capacity as defined in the following graphical illustration.

../../_images/BufferSpaceGraph.png

The X-axis represents the current cluster size and Y-axis represents the percentage of the current cluster size to keep in buffer. The calculation of buffer nodes is determined by two points in the graph, (X1, Y1) and (X2, Y2). The default values of these parameters are mentioned in brackets. If cluster size is less than or equal to 10 nodes, Qubole maintains a minimum of 2 nodes in buffer. As the cluster size increases, the buffer percentage decreases as per the given graph from 20 percent to 10 percent. For clusters having more than 100 nodes, Qubole maintains a maximum of 10 nodes in buffer. As per the requirement, these parameters are configured to achieve the desired behavior. The following list maps the graphical points and buffer capacity configuration properties (passed as Hadoop overrides):

  • X1, the X coordinate of point (X1, Y1) - yarn.autoscaling.buffer_nodes.cluster_size.lvalue
  • X2, the X coordinate of point (X2, Y2) - yarn.autoscaling.buffer_nodes.cluster_size.rvalue
  • Y1, the Y coordinate of point (X1, Y1) - yarn.autoscaling.buffer_nodes.buffer_percent.lvalue
  • Y2, the Y coordinate of point (X2, Y2) - yarn.autoscaling.buffer_nodes.buffer_percent.rvalue

If the cluster is configured with buffer nodes and the count of buffer nodes to maintain is greater than the configured minimum cluster size, then cluster comes up with more nodes than minimum number of nodes and the buffer node count determines the additional nodes. The additional nodes (apart from minimum number of nodes) are in the same proportion as determined by the cluster’s spot percent configuration and does not always be On-Demand.

For example, let us consider a cluster configured with:

  • 2 as its minimum cluster size
  • 4 buffer nodes (greater than the minimum cluster size)
  • 50% as the spot percent configuration

After the cluster with above configuration comes up, the cluster automatically upscales to 4 nodes by bringing up 1 spot and 1 On-Demand nodes. The cluster does not downscale below 4 nodes and this acts as the new minimum cluster size.

With buffer capacity in the clusters, a newly submitted query gets quickly executed without waiting for the cluster to upscale.

Configuring Buffer Capacity in Presto Clusters

You can configure Presto clusters to always maintain some specific nodes as buffer capacity.

Use ascm.cluster-start-buffer-workers as the parameter to set its count to a required value for configuring the buffer capacity. For example, if the value (count) is set to 5, then the cluster autoscaling always maintains a capacity of 5 minimum nodes during the cluster’s lifetime.

Note

With this feature enabled, the cluster upscales using buffer capacity as the trigger to upscale as opposed to triggers described in workload-aware Presto autoscaling.

If the value of ascm.cluster-start-buffer-workers is greater than the configured minimum cluster size, then cluster comes up with more nodes than minimum number of nodes and ascm.cluster-start-buffer-workers determines additional nodes. The additional nodes (apart from minimum number of nodes) are in the same proportion as determined by the cluster’s spot percent configuration. The additional nodes are not always On-Demand nodes.

For example, let us consider a cluster configured with (same example that is cited above):

  • 2 as its minimum cluster size
  • 4 buffer nodes (greater than the minimum cluster size)
  • 50% as the spot percent configuration

After the cluster with above configuration comes up, the cluster automatically upscales to 4 nodes by bringing up 1 spot and 1 On-Demand nodes. The cluster does not downscale below 4 nodes and this acts as the new minimum cluster size.

Both query-manager.required-workers as well as ascm.cluster-start-buffer-workers achieve increased spot usage when the corresponding configured values are greater than minimum cluster size. However, with buffer capacity in the clusters, a newly submitted query gets quickly executed without waiting for the cluster to upscale. Whereas with required workers, a newly submitted query waits before executing until its worker nodes requirement is met.