Presto

Presto 317 is Generally Available

PRES-3429: Presto version 317 is generally available. Cluster Restart Required

Dynamic Concurrency and Hybrid Autoscaling

PRES-3373: Changes to workload-aware autoscaling include dynamic concurrency and queue-aware autoscaling in conjunction with CPU-based autoscaling. Automated workload management and related changes improve performance and reliability and reduce total cost of ownership (TCO). Gradual Rollout | Cluster Restart Required

Enhancements in Presto for JDBC and ODBC Drivers

Enhancements in Presto for next generation (v3) JDBC and ODBC drivers are designed to make these drivers as fast as open source drivers and to:

  • Support cluster lifecycle management (automatically start a cluster when a query is submitted and automatically terminate idle clusters)
  • Make query history available in the Analyze and Workbench UIs
  • Provide enhanced security (HTTPS) and user authentication (through an API token)

Improvements in Dynamic Filtering

PRES-3288: Dynamic filtering (DF) improvements include the following:

  • PRES-3002: A new configuration property, hive.max-execution-partitions-per-scan, limits the maximum number of partitions that a table scan is allowed to read during query execution. Disabled | Cluster Restart Required
  • PRES-3148: Extends DF optimization to semi-joins to take advantage of a selective build side in queries with the IN clause.
  • PRES-3149: Pushes dynamic filters down to ORC and Parquet readers to reduce data scanned on the probe side for partitioned as well as non-partitioned tables. Cluster Restart Required
  • PRES-3404: Improves utilization of dynamic filters on worker nodes and reduces the load on the coordinator when dynamic filtering is enabled.
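
As a rough illustration of the idea behind dynamic filtering (not Presto's implementation), the following Python sketch collects join-key values from a selective build side and uses them to skip probe-side partitions before they are read. The function names, the in-memory data layout, and the choice to fail the scan when the partition limit is exceeded are assumptions made for illustration; only the property name hive.max-execution-partitions-per-scan comes from the notes above.

```python
from typing import Dict, Iterable, List, Set


def build_dynamic_filter(build_rows: Iterable[dict], key: str) -> Set:
    """Collect the distinct join-key values seen on the build side."""
    return {row[key] for row in build_rows}


def prune_partitions(partitions: Dict[str, List[dict]], key: str,
                     dynamic_filter: Set, max_partitions: int = 0) -> List[str]:
    """Return the names of probe-side partitions that can contain matches.

    `partitions` maps a partition name to its rows; a real engine would
    consult partition values or file statistics instead of scanning rows.
    `max_partitions` imitates a per-scan partition limit. Treating 0 as
    "no limit" and raising an error are assumptions of this sketch.
    """
    selected = [name for name, rows in partitions.items()
                if any(row[key] in dynamic_filter for row in rows)]
    if max_partitions and len(selected) > max_partitions:
        raise RuntimeError(
            f"scan would read {len(selected)} partitions, "
            f"limit is {max_partitions}")
    return selected


# Build side: a small, selective table, as in `WHERE region_id IN (subquery)`.
build = [{"region_id": 2}, {"region_id": 5}]
# Probe side: a larger table partitioned by region_id.
probe = {
    "region_id=1": [{"region_id": 1, "sales": 10}],
    "region_id=2": [{"region_id": 2, "sales": 20}],
    "region_id=5": [{"region_id": 5, "sales": 50}],
}

region_filter = build_dynamic_filter(build, "region_id")
print(prune_partitions(probe, "region_id", region_filter))
```

In this toy data set, only the partitions whose key appears on the build side are kept, so the partition for region 1 is never read.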

Improvements in Reading Hive ACID Tables

  • PRES-2840: Because Hive 2.0-versioned ACID transactional tables are not supported in Presto 317, QDS has added checks to fail queries using such tables.
  • PRES-3320: QDS has added checks to fail Presto queries on Hive ACID tables when the Hive metastore server’s version is older than 3.0.

Changes in Datadog Alerts

Qubole has made these changes to Datadog alerts and metrics:

  • PRES-3360: Adds a Datadog alert to detect runaway splits occupying execution slots for more than 10 minutes, removes the presto.jmx.qubole.request_failures metric from the default Datadog dashboard, and removes the Datadog alert for CPU utilization over 80%.
  • PRES-3468: Adds a Datadog alert to detect if the Coordinator Average Heap Memory Usage is more than 90%.
  • PRES-3508: Adds a Datadog alert to detect if the coordinator’s Presto server open file descriptor has exceeded its limit.
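
The conditions behind the three new alerts can be sketched in Python as follows. This only illustrates the thresholds quoted above and is not Qubole's alert definition: the metric names in the dictionary are hypothetical, and real Datadog monitors are configured in Datadog rather than in application code.

```python
RUNAWAY_SPLIT_SECONDS = 10 * 60   # a split holding a slot for more than 10 minutes
HEAP_USAGE_LIMIT = 0.90           # coordinator average heap usage above 90%


def triggered_alerts(metrics: dict) -> list:
    """Return the names of the alerts whose condition holds for `metrics`.

    All keys of `metrics` and all alert names are hypothetical and were
    chosen for this sketch only.
    """
    alerts = []
    if metrics["longest_split_seconds"] > RUNAWAY_SPLIT_SECONDS:
        alerts.append("runaway_splits")
    if metrics["coordinator_avg_heap_used_fraction"] > HEAP_USAGE_LIMIT:
        alerts.append("coordinator_heap")
    if metrics["open_file_descriptors"] > metrics["file_descriptor_limit"]:
        alerts.append("open_file_descriptors")
    return alerts


sample = {
    "longest_split_seconds": 720,                  # 12 minutes
    "coordinator_avg_heap_used_fraction": 0.85,
    "open_file_descriptors": 1000,
    "file_descriptor_limit": 65536,
}
print(triggered_alerts(sample))
```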

Enforcing Group Quotas in Resource Group-based Dynamic Cluster Sizing

PRES-3194: In resource group-based dynamic cluster sizing, QDS now enforces individual resource group quotas for CPU resources even when the cluster autoscales to the union of two resource group quotas.
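
The behavior can be pictured with a small Python sketch. It assumes that the "union" of quotas is their sum and that quotas are expressed in CPU units; the function names and the group names are hypothetical, and the sketch does not reflect how QDS implements the check.

```python
def cluster_cpu_target(quotas: dict) -> int:
    """Size the cluster for the combined quotas of all resource groups."""
    return sum(quotas.values())


def grant_cpu(group: str, requested: int, in_use: dict, quotas: dict) -> int:
    """Grant CPU to `group` without letting it exceed its own quota,
    even if the cluster as a whole still has spare capacity."""
    remaining = max(quotas[group] - in_use.get(group, 0), 0)
    return min(requested, remaining)


quotas = {"etl": 40, "adhoc": 20}   # hypothetical groups and CPU quotas
in_use = {"etl": 30, "adhoc": 0}

print(cluster_cpu_target(quotas))             # cluster may scale to 60 CPUs
print(grant_cpu("etl", 25, in_use, quotas))   # etl is capped at its own quota
```

Here the cluster can grow to 60 CPUs, but the "etl" group is granted only the 10 CPUs left in its own quota rather than the 25 it requested.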

Enhancements

  • PRES-2958: Provides a procedure for the Hive connector to clear a table’s cache. Depending on your Presto version, use one of these procedures to clear the cache for a given table:

    • Presto 0.208: catalogName.default.clear_table_cache('schema_name','table_name')
    • Presto 317: catalogName.system.clear_table_cache('schema_name','table_name')
  • PRES-3257: Presto now supports removing unhealthy nodes based on disk usage. The coordinator node periodically monitors disk usage on worker nodes and gracefully shuts down nodes that exceed a threshold, which defaults to 0.9. To change the threshold, set the ascm.bad-node-removal.disk-usage-max-threshold parameter; the supported range is 0.0 to 1.0. Beta | Cluster Restart Required

  • PRES-3273: Improvements in Presto Ranger integration:

  • PRES-3307: Presto on Qubole authenticates Presto REST API endpoints when SSL is enabled. In Presto version 0.208, inter-node communication between the coordinator and worker nodes is authenticated only when SSL is enabled. In Presto version 317, this communication is authenticated even when SSL is disabled. These changes are backported into Presto versions 0.208 and 317 from the latest open-source Presto version.

    For more information, see the documentation.

  • PRES-3353: QueryHistID is now returned as part of the error message for queries executed through cloud-agnostic drivers if show_on_ui is set to true for these drivers. QueryHistID is useful in debugging. Qubole plans to provide cloud-agnostic drivers shortly.

  • PRES-3469: Backports open-source fixes to improve the performance of inequality JOINs that involve BETWEEN and GROUP BY queries.

Bug Fixes

  • PRES-1799: Presto now returns the number of files written during an INSERT OVERWRITE DIRECTORY (IOD) query in QueryInfo. The Presto client in the QDS Control Plane waits for this information to display the returned number of files at the IOD location. This fixes eventual consistency issues in reading query results through the QDS UI.
  • PRES-3411: Fixes the UnsupportedOperationException that occurred in certain multi-join queries with dynamic filtering enabled.
  • PRES-3513: Fixes a problem that caused an error (Equi criteria are empty, so INNER join should not have PARTITIONED distribution type) during the planning phase for certain queries involving multiple joins when dynamic filtering was enabled.
  • PRES-3544: Fixes a problem that caused dynamic filtering not to work on SSL-enabled clusters.