Using the Presto Query Retrying Mechanism

Qubole has added a query retry mechanism to handle query failures (if possible). It is useful in cases when Qubole adds nodes to the cluster during autoscaling or after a spot node loss (that is when the cluster composition contains Spot nodes). The new query retry mechanism:

  • Retries a query which is failed with the LocalMemoryExceeded error when the new nodes are added to the cluster or in the process of being added to the cluster.
  • Retries a query which failed with an error due to the worker node loss.

In the above two scenarios, there is a waiting time period to let new nodes join the cluster before Presto retries the failed query. To avoid an endless waiting time period, Qubole has added appropriate timeouts. Qubole has also ensured that any actions performed by the failed query’s partial execution are rolled back before retrying the failed query.

Uses of the Query Retry Mechanism

The query retry mechanism is useful in these two cases:

  • When a query triggers upscaling but fails with the LocalMemoryExceeded error as it is run on a smaller-size cluster. The retry mechanism ensures that the failed query is automatically retried on that upscaled cluster.
  • When a Spot node loss happens during the query execution. The retry mechanism ensures that the failed query is automatically retried when new nodes join the cluster (when there is a Spot node loss, Qubole automatically adds new nodes to stabilize the cluster after it receives a Spot termination notification. Hence, immediately after the Spot node loss, a new node joins the cluster).

Enabling the Query Retry Mechanism

You can enable this feature at cluster and session levels by using the corresponding properties:

  • At the cluster level: Override retry.autoRetry=true in the Presto cluster overrides. On the Presto Cluster UI, you can override a cluster property under Advanced Configuration > PRESTO SETTINGS > Override Presto Configuration.

  • At the session level: Set auto_retry=true in the specific query’s session.

    Note

    The session property is more useful as an option to disable the retry feature at query level when autoRetry is enabled at the cluster level.

Configuring the Query Retry Mechanism

You can configure these parameters:

  • retrier.max-wait-time-local-memory-exceeded: It is the maximum time to wait for Presto to give up on retrying while waiting for new nodes to join the cluster, if the query has failed with the LocalMemoryExceeded error. Its value is configured in seconds or minutes. For example, its value can be 2s, or 2m, and so on. Its default value is 5m. If a new node does not join the cluster within this time period, Qubole returns the original query failure response.
  • retrier.max-wait-time-node-loss: It is the maximum time to wait for Presto to give up on retrying while waiting for new nodes to join the cluster if the query has failed due to the Spot node loss. Its value is configured in seconds or minutes. For example, its value can be 2s, or 2m, and so on. Its default value is 3m. If a new node does not join the cluster within this configured time period, the failed query is retried on the smaller-sized cluster.
  • retry.nodeLostErrors: It is a comma-separated list of Presto errors (in a string form) that signify the node loss. The default value of this property is "REMOTE_HOST_GONE","TOO_MANY_REQUESTS_FAILED","PAGE_TRANSPORT_TIMEOUT".

Understanding the Query Retry Mechanism

The query retries can occur multiple times. By default, three retries can occur if all conditions are met. The conditions on which the retries happen are:

  • The error is retryable. Currently, LocalMemoryExceeded and node loss errors: REMOTE_HOST_GONE, TOO_MANY_REQUESTS_FAILED, and PAGE_TRANSPORT_TIMEOUT are considered retryable. This list of node loss errors is configurable using the retry.nodeLostErrors property.
  • Only INSERT OVERWRITE DIRECTORY queries are considered retryable.
  • The actions of a failed query are rolled back successfully. If the rollback fails or if Qubole times out waiting, then it does not retry it.
  • A failed query has a chance to succeed if retried:
    • For the LocalMemoryExceeded error: The query has a chance to succeed if the current number of workers is greater than the number of workers handling the Aggregation stage. If Qubole times out waiting to get to this state, it does not retry.
    • For the node loss errors: The query has a chance to succeed if the current number of workers is greater than or equal to the number of workers that the query ran on earlier. If Qubole times out waiting to get to this state, it goes ahead with the retry as the query may still pass in the smaller-sized cluster.