Presto Metrics on the Default Datadog Dashboard

Qubole Presto supports Datadog monitoring and displays Presto metrics on Datadog dashboards.


Access to the Datadog UI is not enabled by default. To enable it on the QDS account, create a ticket with Qubole Support.

When Datadog monitoring is configured on a Presto cluster, the metrics of an active cluster are displayed on a default Datadog dashboard. The default Datadog dashboard metrics are:

  • presto.jmx.gc.minor_collection_time
  • presto.jmx.avg_planning_time
  • presto.jmx.qubole.workers
  • presto.jmx.qubole.request_failures
  • presto.jmx.running_queries
  • presto.jmx.completed_queries
  • presto.jmx.failed_queries
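
The metric names above can also be used outside the dashboard, for example in a Datadog timeseries query. The following is a minimal sketch, not part of the Qubole integration: it assumes the Datadog v1 query endpoint on the default `api.datadoghq.com` site (other Datadog regions use a different host), and the function name `build_metric_query` is hypothetical. It only builds the request and does not send it.

```python
import time
import urllib.parse
import urllib.request

# The seven metrics shown on the default dashboard (names taken from the list above).
DEFAULT_PRESTO_METRICS = [
    "presto.jmx.gc.minor_collection_time",
    "presto.jmx.avg_planning_time",
    "presto.jmx.qubole.workers",
    "presto.jmx.qubole.request_failures",
    "presto.jmx.running_queries",
    "presto.jmx.completed_queries",
    "presto.jmx.failed_queries",
]


def build_metric_query(metric, api_key, app_key, window_seconds=3600, now=None):
    """Build (but do not send) a Datadog timeseries query request for one metric."""
    end = int(now if now is not None else time.time())
    start = end - window_seconds
    params = urllib.parse.urlencode({
        "from": start,
        "to": end,
        # "{*}" selects all sources. Filtering by cluster is omitted because the
        # tags that Qubole attaches to these metrics are not described here.
        "query": f"avg:{metric}{{*}}",
    })
    return urllib.request.Request(
        f"https://api.datadoghq.com/api/v1/query?{params}",
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
    )


if __name__ == "__main__":
    req = build_metric_query("presto.jmx.running_queries", "<API token>", "<APP token>")
    print(req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would perform the call; the API and APP tokens are the same pair that is configured on the cluster.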


For more details on these metrics and the actions you can take to address the cause of errors, see Understanding the Presto Metrics for Monitoring.

In addition to the default Presto metrics that Qubole sends to Datadog, you can send other Presto metrics to Datadog. In its Datadog integration, Qubole uses Datadog’s JMX agent, which is configured through the jmx.yaml configuration file and uses 8097 as the JMX port. This enhancement is available as a beta feature; to enable it, create a ticket with Qubole Support.
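
To make the jmx.yaml mechanism more concrete, here is a minimal sketch of what one additional metric entry could look like in the Datadog JMX check format (`instances` with `host`, `port`, and a `conf` list of `include` filters). The exact file that Qubole generates is not shown in this document, so treat the layout as an assumption. The bean `com.facebook.presto.execution:name=QueryManager`, the attribute `QueuedQueries`, and the alias `presto.jmx.queued_queries` are illustrative choices, and the helper names are hypothetical. The output is JSON, which YAML 1.2 parsers generally accept.

```python
import json

JMX_PORT = 8097  # JMX port used by the Qubole Datadog integration (stated above)


def jmx_include(domain, bean, attribute, alias, metric_type="gauge"):
    """Return one 'include' filter mapping a JMX attribute to a Datadog metric name."""
    return {
        "include": {
            "domain": domain,
            "bean": bean,
            "attribute": {attribute: {"alias": alias, "metric_type": metric_type}},
        }
    }


def build_jmx_config(includes, host="localhost", port=JMX_PORT):
    """Assemble a JMX check configuration with a single instance."""
    return {
        "init_config": {},
        "instances": [{"host": host, "port": port, "conf": list(includes)}],
    }


if __name__ == "__main__":
    config = build_jmx_config([
        jmx_include(
            domain="com.facebook.presto.execution",            # assumed example domain
            bean="com.facebook.presto.execution:name=QueryManager",  # assumed example bean
            attribute="QueuedQueries",                         # assumed example attribute
            alias="presto.jmx.queued_queries",                 # hypothetical metric name
        )
    ])
    print(json.dumps(config, indent=2))
```

Keeping the alias under the `presto.jmx.` prefix follows the naming of the default metrics, which makes custom metrics easy to find next to them in Datadog.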

As a prerequisite, you must enable Datadog monitoring on the Presto cluster.

Enabling Datadog

Advanced configuration: Modifying Cluster Monitoring Settings describes how to enable Datadog through the cluster UI. Add the Datadog API and APP tokens in the Advanced Configuration of the Presto cluster. Create a New Cluster describes how to configure Datadog through an API call.

Here is an example that illustrates Datadog tokens on the cluster UI.


You can also enable Datadog monitoring in Control Panel > Account Settings, which applies the settings to all clusters of that account. For information on enabling Datadog at the account level, see Configuring your Access Settings using IAM Keys or Managing Roles.

Viewing the Default Datadog Dashboard

After you enable Datadog on the QDS account or cluster, the Presto metrics are displayed on the Datadog UI. To generate metrics, run a Presto query from the QDS UI or through the API.

Here is an example of a Presto query.


Log in to Datadog and navigate to Dashboards. The Presto dashboards appear in the list. Here is an illustration of the Datadog dashboards.


Click the default Datadog dashboard, which is named with this convention: Account <account owner> Cluster <label> (<cluster ID>). The dashboard shows the default Datadog metrics. Here is an example of the Presto metrics on the default Datadog dashboard.
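
When an account has many clusters, it can help to compute the expected dashboard name and search for it. The following is a small sketch of the naming convention above; the function names are hypothetical, and the exact spacing and punctuation in the real dashboard titles is assumed to match the convention as written.

```python
def default_dashboard_name(account_owner, cluster_label, cluster_id):
    """Format a dashboard title as: Account <account owner> Cluster <label> (<cluster ID>)."""
    return f"Account {account_owner} Cluster {cluster_label} ({cluster_id})"


def find_dashboard(titles, account_owner, cluster_label, cluster_id):
    """Return the matching title from a list of dashboard titles, or None if absent."""
    expected = default_dashboard_name(account_owner, cluster_label, cluster_id)
    return next((title for title in titles if title.strip() == expected), None)


if __name__ == "__main__":
    # Example values are placeholders, not real account data.
    print(default_dashboard_name("jane@example.com", "presto-prod", 12345))
```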


Default Alerts as Set by QDS

Qubole has set these alerts by default:

  • If the master CPU utilization goes beyond 80%, you receive an alert.
  • If presto.jmx.avg_planning_time is greater than 2 minutes, you receive an alert.

To customize the threshold values or to receive alerts on other metrics, create your own alerts in Datadog. For information on how to create alerts and configure email notifications, see the Datadog Alerts description.
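
The two default conditions, and the effect of changing their thresholds, can be sketched as follows. This only illustrates the comparison logic; it is not how Datadog evaluates monitors. The class and function names are hypothetical, and the sketch assumes the planning time is supplied in seconds (the unit reported by presto.jmx.avg_planning_time is not stated here).

```python
from dataclasses import dataclass


@dataclass
class AlertThresholds:
    master_cpu_percent: float = 80.0          # default: alert above 80% CPU
    avg_planning_time_seconds: float = 120.0  # default: alert above 2 minutes


def triggered_alerts(master_cpu_percent, avg_planning_time_seconds, thresholds=None):
    """Return the names of the alert conditions that the given readings exceed."""
    t = thresholds or AlertThresholds()
    alerts = []
    if master_cpu_percent > t.master_cpu_percent:
        alerts.append("master_cpu_utilization")
    if avg_planning_time_seconds > t.avg_planning_time_seconds:
        alerts.append("presto.jmx.avg_planning_time")
    return alerts


if __name__ == "__main__":
    # Default thresholds: only the CPU condition is exceeded.
    print(triggered_alerts(85.0, 30.0))
    # Custom thresholds: a stricter planning-time limit of 20 seconds.
    print(triggered_alerts(85.0, 30.0, AlertThresholds(90.0, 20.0)))
```

Both comparisons are strict, so a reading exactly at the threshold does not count as exceeding it.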

Here is an example of the master CPU utilization alert.


For details on each metric and the actions you can take when an alert is raised, see Understanding the Presto Metrics for Monitoring.