Join Reordering and Join Distribution Type Determination Based on Table Size

Presto version 0.208 on Qubole now has the ability to do stats-based determination of JOIN-distribution-type (between BROADCAST and PARTITIONED); back ported from the Presto open source version 0.210. It means that if table statistics are present, Presto’s cost-based-optimizer tries to optimise a query plan by choosing the right type of JOIN implementation on the basis of memory, CPU and network cost for every JOIN node in the plan.


This feature is part of Gradual Rollout and it is only available for Hive tables.

The size is calculated only for partitions that are being queried. Further, the distribution type of JOINS in a query is also visible in the Presto query info under the joinDistributionStats key name.

For using the automatic JOIN type determination, you can set:

  • join-distribution-type=AUTOMATIC in at the cluster level
  • join_distribution_type=AUTOMATIC in session properties


This feature cannot be used if the property distributed-join is already set in the session or cluster

This feature requires table statistics to be present for the cost based optimiser to determine join distribution type. Further, the statistics should be maintained and updated as the table evolves for correct estimates. To address the collection and maintenance of the statistics, Qubole provides an Automatic Stats Collection framework.

Further, Qubole has also introduced the notion of estimating table statistics on the basis of the table’s size on the storage layer. It is useful in cases where statistics for tables are not available. If Presto on Qubole is unable to find table statistics, it can fetch the size of the table on the storage layer and estimate the size and the number of rows in the table. This estimate can currently be used to determine the JOIN distribution type and reordering of tables in a multi-JOIN scenario.

For enabling table size based stats to determine join distribution type, you can set:

  • At the cluster level under Advanced Configuration > PRESTO SETTINGS > Override Presto Configuration in the cluster’s UI.
  • In a query, as session properties


For enabling table size-based stats to determine JOIN order, you can set:

  • At the cluster level under Advanced Configuration > PRESTO SETTINGS > Override Presto Configuration in the cluster’s UI.
  • In a query as session properties


Configuring Timeout for Fetching Hive Table Size

hive.table-size-listing-timeout is the property that you can use to set the timeout for listing Hive table sizes. The determination of a table size on the storage layer is performed through a listing on the table’s storage location. The listing operation is bound by a timeout to avoid any significant delays in the query execution time. This configuration controls the timeout for the listing operation.

An example usage of this configuration is adding hive.table-size-listing-timeout=2s to the Hive catalog properties. It would mean that the listing operation on a table’s storage location is bound to complete in 2 seconds. If it does not finish within the timeout, the table is considered to be very large.