Presto

AWS Glue is Supported as a Hive Metastore

QHIVE-4160: Qubole supports using AWS Glue as the primary Hive metastore in Presto. It is supported only on Presto 0.208 and is enabled via Support.

In addition, it supports syncing Hive Metastore with the AWS Glue catalog as described in AWS Glue Catalog Sync.

Resource Groups-based Dynamic Cluster Sizing in Presto

PRES-2265: Qubole has introduced dynamic sizing of Presto clusters based on resource groups. Users are assigned to Presto resource groups, and each resource group has a configurable limit on the maximum number of nodes up to which it can independently scale the cluster.

The maximum cluster size is calculated dynamically based on the active resource groups and their scaling limits. The feature applies only to Presto 0.208, and you can set it at the cluster level. A cluster restart is required.

You can enable this feature at the account level only Via Support.

For more information, see the documentation.

Configuring Required Number of Worker Nodes during Cluster Autoscaling

PRES-1350: Qubole supports configuring the required number of worker nodes during autoscaling through the cluster configuration override query-manager.required-workers. Set it to the number of worker nodes that must be in the cluster before a query is scheduled to run. This enhancement is supported only with Presto 0.193 and later versions. A cluster restart is required.

For more information, see Configuring the Required Number of Worker Nodes.
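
As a rough illustration of the behavior described above, the following sketch models the scheduling gate: a query is held until the number of active workers reaches the configured value. Only the property name query-manager.required-workers comes from this release note. The function name, the value 4, and the sample worker counts are hypothetical, and this is not Qubole's or Presto's scheduler code.

```python
def can_schedule_query(active_workers: int, required_workers: int) -> bool:
    """Return True once enough workers have joined the cluster.

    Illustrative model of the effect of query-manager.required-workers;
    the real check lives inside Presto and may differ in detail.
    """
    if required_workers < 0:
        raise ValueError("required_workers must be non-negative")
    return active_workers >= required_workers


# Hypothetical override value: query-manager.required-workers=4
REQUIRED_WORKERS = 4

for workers in (2, 4, 6):
    state = "schedule" if can_schedule_query(workers, REQUIRED_WORKERS) else "wait"
    print(f"{workers} active workers -> {state}")
```

With the hypothetical value of 4, a query submitted while only 2 workers are up waits, and it becomes schedulable once autoscaling has brought the cluster to 4 or more workers.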

Controlling the Downscaling Velocity in Clusters

PRES-2521: Qubole has added a cluster configuration property, ascm.downscaling.staggered, for controlling the downscaling velocity in Presto clusters. Enabling it results in a linear downscaling profile, which can be a better choice for unpredictable workloads without well-defined peak and lean periods. A cluster restart is required.

For more information, see Controlling the Nodes’ Downscaling Velocity.

Changes in Presto Versions

  • PRES-2684: Presto version 0.208 is now generally available. A cluster restart is required.

  • PRES-2598: Presto version 0.157 is marked as deprecated in the Clusters UI. Presto 0.193 is now the default Presto version.

Presto Ranger Integration is Generally Available

PRES-2470: Ranger integration with Presto is now generally available. This includes support for Ranger Admin over SSL and several bug fixes.

Support for Join Reordering and Join Distribution Type Determination Based on Table Size

Presto on Qubole now estimates table statistics from the table’s size on the storage layer. This feature is in Beta.

This estimate can currently be used to determine the JOIN distribution type (PRES-2029) and to reorder tables (PRES-43) in a multi-JOIN query. For more information, see the documentation.
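
To make the idea concrete, here is a minimal sketch of how a size-based estimate might drive both decisions. It is not Qubole's implementation. The threshold, the function names, the labels "broadcast" and "partitioned", and the largest-first ordering heuristic are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical cutoff: a table estimated at or below this size is broadcast.
BROADCAST_THRESHOLD_BYTES = 100 * 1024 * 1024


def choose_join_distribution(build_side_bytes: int,
                             threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick an illustrative distribution label from the smaller side's estimated size."""
    return "broadcast" if build_side_bytes <= threshold else "partitioned"


def order_tables_for_join(table_sizes: Dict[str, int]) -> List[str]:
    """Order tables from largest to smallest estimated size.

    Assumed heuristic: keep the largest table on the probe side and join
    progressively smaller tables to it. Ties are broken by table name.
    """
    return sorted(table_sizes, key=lambda name: (-table_sizes[name], name))


# Hypothetical on-storage sizes in bytes.
sizes = {
    "orders": 50 * 1024 ** 3,
    "customers": 2 * 1024 ** 3,
    "nation": 4 * 1024,
}

print(order_tables_for_join(sizes))
for table in ("customers", "nation"):
    print(table, "->", choose_join_distribution(sizes[table]))
```

Under these assumed numbers, the small nation table would be broadcast to every worker, while the join with customers would be partitioned across workers.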

Presto Clusters Support Heterogeneous Nodes

Presto clusters now support heterogeneous nodes. This feature is enabled via Support.

For more information, see the documentation on Presto clusters with heterogeneous nodes.

Enhancements

  • PRES-1143: The GET /api/v1.2/commands/<Command-ID>/error_logs API call, which returns the error logs for a failed command, is now available for Presto and SQL commands.

  • PRES-2267: The following changes to the memory pool configuration apply to Presto 0.208:

    • The JVM heap size for worker nodes has been increased from 70% to 80% of the instance memory.

    • The default values of query.max-memory-per-node and query.max-total-memory-per-node are each 30% of the JVM heap size.

    • The default value of memory.heap-headroom-per-node is 20% of the JVM heap size. This results in 30% of the heap for the reserved pool, 20% of the heap as headroom for untracked memory allocations, and the remaining 50% of the heap for the general pool.

  • PRES-2397: Qubole supports escaping newline (\n) and carriage return (\r) characters in data so that results are parsed correctly on the QDS UI. This enhancement is not available by default and is supported only with Presto 0.193 and later versions. It is enabled via Support.

  • PRES-2417: Presto clusters no longer terminate while Presto notebook paragraphs are actively running. This enhancement is not available by default. It is enabled via Support.

  • PRES-2444: The QDS cluster API supports adding and gracefully removing a node from a Presto cluster.

  • PRES-2451: In addition to the default Presto metrics that Qubole sends to Datadog, you can also send other Presto metrics to Datadog. Qubole uses Datadog’s JMX agent through the jmx.yaml configuration file in its Datadog integration, with 8097 as the JMX port. This feature is in Beta and is enabled via Support.

  • PRES-2474: Added an optimization that speeds up queries on system.jdbc.tables that filter on a single table name. This speeds up metadata extraction in Business Intelligence tools such as DBeaver, which issue such queries.

  • PRES-2638: Qubole now allows you to enable the Strict Mode feature at the account level, only via Support.

    Strict Mode restricts the queries that can be run. You can configure this feature at the cluster level as a Presto override.

    When this feature is enabled, it accepts a semicolon-separated list of the following values:

    • MANDATORY_PARTITION_CONSTRAINT

    • DISALLOW_CROSS_JOIN

    • LIMITED_SORT

    For example, qubole-strict-mode-restrictions=LIMITED_SORT;MANDATORY_PARTITION_CONSTRAINT restricts queries to sorting only limited data and requires queries on partitioned tables to include a predicate.

  • PRES-2656: Qubole supports enabling the optimizer.optimize-single-distinct optimizer at the account level, only via Support.

    This optimizer speeds up GROUP BY queries that have multiple aggregation functions along with a single count-distinct function. You can set this property at the cluster level as a Presto override.
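
The PRES-2267 memory defaults listed above can be worked through numerically. The sketch below applies the stated percentages to a hypothetical worker with 64 GB of instance memory. The function and dictionary key names are illustrative and are not Presto or Qubole settings.

```python
def presto_memory_defaults(instance_memory_gb: float) -> dict:
    """Apply the PRES-2267 default percentages to one worker node."""
    heap = 0.80 * instance_memory_gb      # JVM heap: 80% of instance memory
    reserved = 0.30 * heap                # reserved pool: 30% of the heap
    headroom = 0.20 * heap                # memory.heap-headroom-per-node: 20% of the heap
    general = heap - reserved - headroom  # general pool: the remaining 50% of the heap
    return {
        "jvm_heap_gb": heap,
        "reserved_pool_gb": reserved,
        "heap_headroom_gb": headroom,
        "general_pool_gb": general,
    }


# Hypothetical worker with 64 GB of instance memory.
for name, value in presto_memory_defaults(64).items():
    print(f"{name}: {value:.2f}")
```

For the assumed 64 GB worker, this gives a 51.2 GB heap split into about 15.36 GB for the reserved pool, 10.24 GB of headroom, and 25.6 GB for the general pool.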

Bug Fixes

  • PRES-1924: Fixed issues related to connection timeouts from the Ruby client to the Presto coordinator. The message that was displayed was: Connection refused - connect(2).

  • PRES-1968: Fixed the issue in which Presto queries failed with the error: nesting of 101 is too deep.

For a list of bug fixes between versions R55 and R56, see Changelog for api.qubole.com.