Spot Rebalancing in Presto
Presto supports Spot Rebalancing. It helps when the spot ratio of a running cluster falls short of the configured spot ratio because spot nodes are unavailable or are terminated frequently. The Spot rebalancer proactively recovers from this shortfall and brings the cluster's spot ratio as close as possible to its configured value.
By default, Qubole inspects the spot ratio of the cluster every 30 minutes and attempts a rebalancing if the spot
ratio falls short of the configured spot ratio. The inspection interval is configurable using the
ascm.node-rebalancer-cooldown-period property. For example, set
ascm.node-rebalancer-cooldown-period=1h in the Presto cluster
overrides. With this setting, Qubole inspects for a skewed spot ratio every hour instead of every 30 minutes.
Using very small values for
ascm.node-rebalancer-cooldown-period can make the cluster’s
state unstable. Spot rebalancing works only with aggressive downscaling, which must be enabled on the Qubole account.
For more information, see Understanding Aggressive Downscaling in Clusters (AWS).
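The inspection schedule described above can be sketched in a few lines of Python. This is only an illustration of the cooldown logic, not Qubole's implementation: the function names parse_cooldown and inspection_due are hypothetical, and the sketch assumes that the property value is a number followed by a unit suffix (s, m, h, or d), as in the 1h example.

```python
import re
from datetime import datetime, timedelta

# Unit suffixes understood by this sketch (an assumption; the real
# property may accept other units as well).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_cooldown(value: str) -> timedelta:
    """Parse a duration string such as '30m' or '1h' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([smhd])\s*", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(seconds=float(amount) * _UNIT_SECONDS[unit])


def inspection_due(last_inspection: datetime, now: datetime,
                   cooldown: str = "30m") -> bool:
    """Return True once the cooldown has elapsed since the last inspection.

    The 30-minute default mirrors the documented default interval.
    """
    return now - last_inspection >= parse_cooldown(cooldown)
```

With the default value, an inspection is due 30 minutes after the previous one. With a cooldown of 1h, a check made 45 minutes after the previous inspection is not yet due.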
Spot Rebalancing Advanced Configuration Properties
These are the two advanced configuration properties:
ascm.sizer.max-cluster-size-buffer-percentage: While rebalancing a running cluster, Qubole tries to gracefully replace the additional running On-Demand nodes. In that process, the cluster may have to add some nodes beyond its maximum size. This configuration property sets the upper limit, as a percentage of the cluster’s maximum size, by which the cluster can exceed that maximum size while rebalancing. The default value for this configuration property is 10.
For example, consider
ascm.sizer.max-cluster-size-buffer-percentage=20, which means that the cluster can exceed its maximum size by up to 20% while rebalancing.
ascm.node-rebalancer-max-extra-stable-nodes.percentage: This configuration property sets the amount of skew in the spot ratio of running nodes that is allowed in the cluster. If the skew percentage exceeds this configuration property’s value, Qubole attempts to rebalance the cluster nodes to conform to the configured spot ratio. The default value for this configuration property is 10.
For example, consider
ascm.node-rebalancer-max-extra-stable-nodes.percentage=15, which means that the cluster nodes are rebalanced only if the skew in the spot ratio of running nodes exceeds 15%.
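The arithmetic behind the two properties can be illustrated with a small Python sketch. It rests on assumptions that this page does not confirm: the buffer is rounded down to a whole number of nodes, and the skew is measured as the shortfall, in percentage points, of the running spot ratio against the configured one. All function names are hypothetical.

```python
def max_size_while_rebalancing(max_cluster_size: int,
                               buffer_percentage: int = 10) -> int:
    """Largest node count allowed while rebalancing.

    Rounding the buffer down to whole nodes is an assumption.
    """
    return max_cluster_size + (max_cluster_size * buffer_percentage) // 100


def spot_ratio_skew(on_demand_nodes: int, spot_nodes: int,
                    configured_spot_percentage: float) -> float:
    """Shortfall of the running spot ratio, in percentage points.

    Assumed definition: configured spot percentage minus the actual one,
    never negative.
    """
    total = on_demand_nodes + spot_nodes
    if total == 0:
        return 0.0
    actual_spot_percentage = 100 * spot_nodes / total
    return max(0.0, configured_spot_percentage - actual_spot_percentage)


def needs_rebalancing(on_demand_nodes: int, spot_nodes: int,
                      configured_spot_percentage: float,
                      max_extra_stable_percentage: float = 10) -> bool:
    """True only if the skew exceeds the allowed percentage."""
    skew = spot_ratio_skew(on_demand_nodes, spot_nodes,
                           configured_spot_percentage)
    return skew > max_extra_stable_percentage
```

Under these assumptions, a cluster with a maximum size of 50 nodes and a buffer of 20 may grow to 60 nodes while rebalancing. A cluster configured for a 50% spot ratio that runs 14 On-Demand and 6 spot nodes has a skew of 20, so it is rebalanced with a threshold of 15. The same cluster running 12 On-Demand and 8 spot nodes has a skew of 10 and is left alone.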