Presto

New Features and Enhancements

  • PRES-2774, PRES-2060: Encrypted Azure AD key/token support for Azure Gen2 in Presto.

  • PRES-2372: Cost-based optimization (CBO) for JOIN reordering and JOIN distribution type selection, using statistics in the Hive metastore, is enabled by default for Presto version 0.208.

    The following values have been added to the default cluster configuration for Qubole Presto version 0.208.

    optimizer.join-reordering-strategy=AUTOMATIC
    join-distribution-type=AUTOMATIC
    join-max-broadcast-table-size=100MB
    
  • PRES-2256: Presto Decimal Coercion support is available in QDS Presto version 0.208.

  • PRES-2695: QDS allows you to override the cluster-level properties of the required number of workers feature, query-manager.required-workers-max-wait and query-manager.required-workers, at the query level by using the corresponding session-level properties required_workers_max_wait and required_workers.

  • PRES-2918: A new experimental configuration property, experimental.reserved-pool-enabled, has been added to Presto version 0.208 to allow you to disable the Reserved Pool. The Reserved Pool prevents deadlocks when memory is exhausted in the General Pool: the largest query is promoted to the Reserved Pool. However, only one query is promoted, and the remaining queries in the General Pool are blocked whenever the pool is full. To avoid this, you can set experimental.reserved-pool-enabled to false, which disables the Reserved Pool. For more information, see Disabling Reserved Pool.

  • PRES-2657: The path for spill-to-disk functionality, experimental.spiller-spill-path=/media/ephemeral0/presto/spill_dir, is configured by default in Qubole Presto 0.208. This makes spill-to-disk easy to use: either run set session spill_enabled=true for individual queries, or add experimental.spill-enabled=true to the Presto cluster configuration override to enable spill-to-disk for all queries.

  • PRES-111: Added a procedure, invoked as call hive.default.clear_cache(), to clear stale Hive metastore caches. This is useful when metastore updates might have occurred from outside the Presto cluster. The command is supported only in Presto version 0.208.

  • PRES-2742: Configuration pushes to a Presto cluster now use a REST API call instead of SSH. Available via Support; requires a cluster restart.

  • PRES-2744: A new session property, qubole_max_raw_input_datasize=1TB, limits the total bytes scanned by a query. Queries that exceed this limit fail with the RAW_INPUT_DATASIZE_READ_LIMIT_EXCEEDED exception, which prevents rogue queries from running for a very long time.

  • PRES-2790: Performance improvement in queries involving IN and NOT IN over a subquery. See this blog post.

  • PRES-2605: Added a new scheduler that schedules tasks according to where Rubix caches the data. See https://www.qubole.com/blog/presto-rubix-scheduler-improves-cache-reads/.

  • PRES-2584: Improved smart query retry to support INSERT OVERWRITE TABLE, CREATE TABLE AS, and SELECT queries that failed without returning any data. Tracking of query retries in command logs has also been improved, with Query Tracker links for retries.

  • JDBC-124: QDS now supports multiple concurrent statements in Presto FastPath.

  • PRES-2510: Choosing the Presto UI from the QDS Control Panel redirects to <base-url>/presto-ui-<cluster-id>/ui/. It also redirects <coordinator>:dns:8081 to a static resource <base-url>/ui/index.html.

  • PRES-2992: QDS adds the presto-tpcds, presto-localfile, and presto-thrift connectors to Presto versions 0.193 and 0.208.
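
As a rough illustration of the session-level settings described in the items above, a query session might look like the sketch below. The property names and the clear_cache procedure come from these notes. The specific values (4 workers, a 10m wait) and the quoting of the duration and data-size values are assumptions for illustration only, so verify them against your cluster before use.

```sql
-- Sketch only: values are illustrative assumptions, not recommended settings.

-- PRES-2695: override the required-workers properties for this session.
SET SESSION required_workers = 4;                   -- assumed value
SET SESSION required_workers_max_wait = '10m';      -- assumed value and quoting

-- PRES-2657: enable spill-to-disk for queries in this session.
SET SESSION spill_enabled = true;

-- PRES-2744: cap the total bytes a query may scan.
SET SESSION qubole_max_raw_input_datasize = '1TB';  -- quoting is an assumption

-- PRES-111: clear stale Hive metastore caches (Presto 0.208 only).
CALL hive.default.clear_cache();
```

Session properties apply only to the current session, so the cluster-level defaults stay in effect for other users and queries.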

Bug Fixes

  • PRES-2568: Fixes a problem that caused a carriage return \r to be incorrectly added wherever there was a semicolon in a query.

  • PRES-2810: Fixes a problem that caused failures in query planning when dynamic filtering was enabled.