Migrating to Recent Presto Versions

Later versions of Presto use different configuration properties for certain features than earlier versions. The following sections describe the properties and behavior that differ when you migrate to Presto version 317 or from Presto version 0.193 to 0.208.

Migrating to Presto Version 317

Presto version 317 uses different configuration properties for certain features than earlier Presto versions. If you upgrade to Presto version 317 from an earlier version, you must switch to the properties that version 317 supports to continue using the features you have configured.

The following sections describe changes in the configuration and behavior:

Hive Properties

When you migrate to Presto version 317, change or remove the following Hive properties:

  • Remove the hive.parquet-optimized-reader.enabled configuration property and the parquet_optimized_reader_enabled session property. They do not exist in Presto version 317 because it uses the optimized Parquet reader.

  • Remove the hive.parquet-predicate-pushdown.enabled configuration property and the parquet_predicate_pushdown_enabled session property. Pushdown is always enabled in the Presto 317 Parquet reader.

  • Replace the hive.bucket-owner-full-control=true configuration property with hive.s3.upload-acl-type=BUCKET_OWNER_FULL_CONTROL.

  • The hive.collect-column-statistics-on-write configuration property for Hive does not work with Presto 317 and might cause issues with INSERT commands. Qubole recommends disabling hive.collect-column-statistics-on-write.

  • Remove the hive.metastore-cache-ttl-bulk configuration property as it is not supported with Presto version 317.
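The Hive changes listed above can be applied mechanically to a block of override text. The following is a minimal sketch, not a Qubole tool: the function name migrate_hive_overrides is hypothetical, the input is assumed to be plain key=value lines, and setting hive.collect-column-statistics-on-write to false is one way to read "disabling" the property.

```python
# Hypothetical helper: applies the Hive property changes listed above
# to override text made of key=value lines.
REMOVED_KEYS = {
    "hive.parquet-optimized-reader.enabled",
    "hive.parquet-predicate-pushdown.enabled",
    "hive.metastore-cache-ttl-bulk",
}


def migrate_hive_overrides(text):
    migrated = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if line.lstrip().startswith("#") or not sep:
            # Comments, blank lines, and section headers pass through unchanged.
            migrated.append(line)
        elif key in REMOVED_KEYS:
            # These properties do not exist in Presto 317, so drop them.
            continue
        elif key == "hive.bucket-owner-full-control" and value.lower() == "true":
            migrated.append("hive.s3.upload-acl-type=BUCKET_OWNER_FULL_CONTROL")
        elif key == "hive.collect-column-statistics-on-write":
            # Assumption: "disabling" means setting the property to false.
            migrated.append("hive.collect-column-statistics-on-write=false")
        else:
            migrated.append(line)
    return "\n".join(migrated)
```

Session properties such as parquet_optimized_reader_enabled are set in queries rather than in overrides, so this sketch does not touch them.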

Dynamic Filtering

If you enabled dynamic filtering in an earlier version and are migrating to Presto version 317, replace the experimental.dynamic-filtering-enabled configuration property with experimental.enable-dynamic-filtering on the cluster.

To enable Dynamic Filtering at a session level, replace the dynamic_filtering session property with the enable_dynamic_filtering session property.
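If saved queries or cluster overrides still use the old names, the two renames above can be scripted. This is a rough sketch under assumptions: both function names are hypothetical, and the session rewrite only matches statements of the form SET SESSION dynamic_filtering = ... without a catalog prefix.

```python
import re

# Matches "SET SESSION dynamic_filtering =" in any letter case.
_OLD_SESSION = re.compile(r"(\bset\s+session\s+)dynamic_filtering(\s*=)", re.IGNORECASE)


def migrate_session_statements(sql):
    """Rename the dynamic_filtering session property to enable_dynamic_filtering."""
    return _OLD_SESSION.sub(r"\1enable_dynamic_filtering\2", sql)


def migrate_config_line(line):
    """Rename the cluster-level property, keeping its value."""
    key, sep, value = line.partition("=")
    if sep and key.strip() == "experimental.dynamic-filtering-enabled":
        return "experimental.enable-dynamic-filtering=" + value.strip()
    return line
```

Statements that already use enable_dynamic_filtering are left unchanged, so the rewrite can be run more than once.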

Query Properties

Presto 317 introduces two changes:

  • Use SHOW STATS FOR in Presto queries. SHOW STATS ON does not work in Presto 317.

  • Remove the node-scheduler.optimized-local-scheduling configuration property if it was configured in an earlier Presto version. In Presto version 317, the default scheduler is improved to incorporate data locality in the split scheduling decision.

File-based Authentication in Direct Connections

When you migrate to Presto 317, the following changes apply to passwords when you authenticate direct connections through file-based authentication:

  • Presto 317 does not support passwords hashed with the MD5, SHA1, or Crypt algorithms.

  • Presto 317 supports only BCrypt and PBKDF2 passwords. You can generate a BCrypt password from the command line as follows:

    htpasswd -nbBC <COST> <USER> <PASSWORD>

    In the above command, replace COST, USER, and PASSWORD with the corresponding values. The value of COST must be 8 or greater.

    PBKDF2 is a key derivation function with a sliding computational cost. To generate a PBKDF2 password, refer to Java PBKDF2WithHmacSHA1 Hash Example. The iteration count for PBKDF2 must be 1000 or greater.

  • In the /etc/config.properties file, add this configuration:

    http-server.authentication.type=PASSWORD
    
  • In the /etc/password-authenticator.properties file, add this configuration:

    password-authenticator.name=file
    file.password-file=/usr/lib/presto/etc/file_auth
    file.refresh-period=10ms
    file.auth-token-cache.max-size=1000
    

    The values shown above are the minimum values allowed:

    • file.refresh-period must be at least 10ms

    • file.auth-token-cache.max-size must be at least 1000
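The two minimums above can be checked before the file is deployed. The sketch below is illustrative only: check_file_auth and parse_duration_ms are hypothetical names, and the duration parser handles just the ms, s, m, and h suffixes rather than every unit Presto may accept.

```python
import re

# Milliseconds per supported unit (a deliberately small subset).
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}


def parse_duration_ms(text):
    """Convert a value such as '10ms' or '1s' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\s*", text)
    if match is None:
        raise ValueError("unsupported duration: %r" % text)
    return float(match.group(1)) * _UNIT_MS[match.group(2)]


def check_file_auth(properties):
    """Return a list of problems found in password-authenticator.properties text."""
    props = {}
    for line in properties.splitlines():
        key, sep, value = line.partition("=")
        if sep and not line.lstrip().startswith("#"):
            props[key.strip()] = value.strip()

    problems = []
    period = props.get("file.refresh-period")
    if period is not None and parse_duration_ms(period) < 10:
        problems.append("file.refresh-period must be at least 10ms")
    size = props.get("file.auth-token-cache.max-size")
    if size is not None and int(size) < 1000:
        problems.append("file.auth-token-cache.max-size must be at least 1000")
    return problems
```

An empty list means both values meet the minimums stated above; properties that are absent are not reported.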

Data Sources Properties

In Presto version 317, the data sources configuration property is not necessary for configuring connectors. If you configured it in an existing cluster with Presto 0.208 or an earlier version, remove it from the Presto cluster overrides so that the Presto 317 cluster starts successfully.

Migrating from Presto Version 0.193 to 0.208

Presto version 0.208 uses different configuration properties for certain features than Presto version 0.193. If you upgrade a Presto 0.193 cluster to Presto version 0.208, you must switch to the properties that version 0.208 supports to continue using the features you have configured.

The following sections describe changes in the configuration and behavior:

Query Execution Properties

If you have configured query execution properties on a cluster that runs Presto version 0.193, configure these properties when you upgrade that cluster to Presto version 0.208:

  • query.max-memory-per-node: If you have overridden this property in the Presto 0.193 cluster, ensure that it is set to the default value or to another value that Presto version 0.208 supports when you upgrade. If you have not overridden this property, the Presto 0.208 cluster uses its default value.

  • query.max-total-memory-per-node: You need to configure this property only if you had overridden query.max-memory-per-node in the Presto 0.193 cluster.

  • memory.heap-headroom-per-node: Presto version 0.208 configures this property by default. You can override it later if you want to change its default value.

For more information on these properties, see Understanding the Query Execution Properties. Note that the corresponding session-level properties also differ in Presto version 0.208.

JOIN Reordering

If you have configured qubole-reorder-joins to enable JOIN Reordering on the Presto 0.193 cluster, remove qubole-reorder-joins and configure optimizer.join-reordering-strategy when you upgrade that cluster to Presto version 0.208.

For more information on these properties, see Specifying JOIN Reordering.

UNNEST Row Types

UNNEST of a collection of row types produces one column per row field in Presto version 0.208, whereas earlier versions produce a single column that contains the whole row.

Behavior in Presto 0.193 and earlier versions

presto> SELECT * FROM UNNEST(array[row('a', 1), row('b', 2)]);
       _col0
----------------------
 {field0=a, field1=1}
 {field0=b, field1=2}
(2 rows)

Behavior in Presto version 0.208

presto> SELECT * FROM UNNEST(array[row('a', 1), row('b', 2)]);
 _col0 | _col1
-------+-------
 a     |     1
 b     |     2
(2 rows)

Decimal Literals

In Presto version 0.208, decimal literals without an explicit type specifier are parsed as the DECIMAL type by default. Earlier Presto versions parse them as the DOUBLE type.

Behavior in Presto 0.193 and earlier versions

presto> select 1.0/30;
 _col0
---------------------
 0.03333333333333333
(1 row)

Behavior in Presto version 0.208

presto> select 1.0/30;
 _col0
-------
 0.0
(1 row)

You can enable the legacy behavior in either of the following ways:

  • Add a Presto configuration override on the Presto cluster: parse-decimal-literals-as-double=true

  • Set the session property: set session parse_decimal_literals_as_double=true;

Alternatively, you can modify Presto queries to explicitly cast decimal literals to the DOUBLE type as follows:

presto> select cast(1.0 as double)/30;
     _col0
--------------------
 0.03333333333333333
(1 row)
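The difference between the two results for 1.0/30 can be reproduced outside Presto. The sketch below uses Python's decimal module as a stand-in for the DECIMAL type and assumes, as the 0.208 output above suggests, that the quotient keeps the single fractional digit of the literal 1.0. It illustrates the arithmetic only and is not Presto's implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

# DECIMAL-style division: the quotient is kept to one fractional digit,
# matching the scale of the literal 1.0 (an assumption about the result type).
as_decimal = (Decimal("1.0") / Decimal(30)).quantize(
    Decimal("0.1"), rounding=ROUND_HALF_UP
)

# DOUBLE-style division: ordinary binary floating point.
as_double = 1.0 / 30

print(as_decimal)  # 0.0
print(as_double)   # roughly 0.0333
```

With one fractional digit, 0.0333... rounds to 0.0, which is why queries that relied on DOUBLE arithmetic need either the legacy setting or an explicit cast.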

Viewing Table Partitions

The SHOW PARTITIONS command is not available from Qubole Presto 0.208 onwards. Partitioned tables in Presto 0.208 and later versions have a hidden system table that contains the partition values.

For example, a table named foo has a partitions table named foo$partitions.

SELECT * FROM "foo$partitions" provides the same functionality and data as SHOW PARTITIONS foo.
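Scripts that previously issued SHOW PARTITIONS can build the replacement query from the table name. The helper below is a hypothetical sketch: partitions_query is not part of any Presto client, and it accepts only a bare table name without a catalog or schema prefix.

```python
def partitions_query(table):
    """Build the query that replaces SHOW PARTITIONS for a bare table name."""
    # The identifier is double-quoted because of the "$" character;
    # any embedded double quote is escaped by doubling it.
    quoted = '"' + table.replace('"', '""') + '$partitions"'
    return "SELECT * FROM " + quoted


print(partitions_query("foo"))  # SELECT * FROM "foo$partitions"
```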