Understanding the Presto Engine Configuration

The cluster settings page has a text box labelled Override Presto Configuration, which you can use to customize a Presto cluster. An entry in this box can have multiple sections; each section should have a section title, which serves as the relative path of the configuration file within Presto's etc directory, followed by the configuration. You can configure JVM settings, common Presto settings, and connector settings from here. You can learn more about these sections in Presto's official documentation. Here is an example custom configuration:

jvm.config:
-Xmx10g

config.properties:
ascm.enabled=false

catalog/hive.properties:
hadoop.cache.data.enabled=false

Some important parameters for each configuration are covered in the following sections.

jvm.config

These settings are populated automatically and generally do not require custom values. They are used when launching the Presto server JVM.

Parameter Example Default Description
-Xmx -Xmx10g 70% of Instance Memory Maximum memory for the JVM heap; the example sets it to 10 GB.
-XX:+ExitOnOutOfMemoryError false true Because an OutOfMemoryError typically leaves the JVM in an inconsistent state, Qubole forcibly terminates the process when it occurs by setting this JVM property.
-Djdk.nio.maxCachedBufferSize 2097900 2097152 This property limits the amount of native memory used for NIO buffers; Qubole sets a default value for it to prevent an increase in the non-heap memory usage of the JVM process. Its value is set in bytes.
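If you do need to override a JVM setting, the entry uses the same section format as the example above. The values shown here are taken from the example column and are illustrative only, not recommendations:

jvm.config:
-Xmx10g
-Djdk.nio.maxCachedBufferSize=2097900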

Presto Configuration Properties

The config.properties settings are described in the following sections.

Understanding the Autoscaling Properties

Parameter Examples Default Description
ascm.enabled true, false true Use this parameter to enable autoscaling.
ascm.upscaling.enabled true, false true Use this parameter to enable upscaling.
ascm.downscaling.enabled true, false true Use this parameter to enable downscaling.
ascm.bds.target-latency 1m, 50s 1m Sets the target latency for jobs. Increasing it makes autoscaling less aggressive.
ascm.bds.interval 10s, 1m 10s The periodic interval after which reports are gathered and processed to determine the cluster's optimal size.
ascm.downscaling.trigger.under-utilization-interval 5m, 45s 5m The time interval during which every cycle of report processing must recommend scaling down before the cluster is actually scaled down. For example, when this interval is set to 5m, the scaling logic initiates downscaling only if all reports over a 5-minute window indicate that the cluster is under-utilized. This safeguards against temporary blips that would otherwise cause downscaling.
ascm.downscaling.group-size 5, 8 5 Downscaling happens in steps; the value indicates the number of nodes removed per cycle of downscaling.
ascm.upscaling.trigger.over-utilization-interval 4m, 50s value of ascm.bds.interval The time interval during which every cycle of report processing must recommend scaling up before the cluster is actually scaled up.
ascm.upscaling.group-size 9, 10 Infinite Upscaling happens in steps; the value indicates the number of nodes added per cycle of upscaling (capped by the maximum size set for the cluster).
query-manager.required-workers 4, 6 NA Sets the number of worker nodes that must be present in the cluster before a query is scheduled to run on it. If that many workers do not join in time, the query is scheduled only after the configured query-manager.required-workers-max-wait timeout. This is only supported in Presto 0.193 and later versions. For more information, see Configuring the Required Number of Worker Nodes.
query-manager.required-workers-max-wait 7m, 9m 5m The maximum time a query waits before being scheduled on the cluster if the required number of worker nodes set through query-manager.required-workers could not be provisioned. For more information, see Configuring the Required Number of Worker Nodes.
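For example, an override that disables downscaling and holds queries until four worker nodes have joined (supported in Presto 0.193 and later; the values are taken from the example columns above and are illustrative, not recommendations) could look like this:

config.properties:
ascm.downscaling.enabled=false
query-manager.required-workers=4
query-manager.required-workers-max-wait=7m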

Understanding the Query Execution Properties

Parameter Examples Default Description
query.max-concurrent-queries 2000 1000 It denotes the number of queries that can run in parallel.
query.max-execution-time 20d, 45h 100d

It denotes the time limit on query execution. It considers only the time spent in the query execution phase. The default value is 100 days. This parameter can be set in any of these time units:

  • Nanoseconds denoted by ns
  • Microseconds denoted by us
  • Milliseconds denoted by ms
  • Seconds denoted by s
  • Minutes denoted by m
  • Hours denoted by h
  • Days denoted by d

Its equivalent session property is query_max_execution_time, which can also be specified in any of the time units given above.

query.max-memory-per-node 10GB, 20GB 28% of Physical Memory Maximum memory that a query can take up on a node. If the value is set to more than 42% of Physical Memory, cluster failures occur. 40% of Heap is reserved for the system memory pool, so you can allocate the remaining 60% of Heap to this configuration. In Qubole, since Heap is 70% of Physical Memory, you can set a maximum of 42% of Physical Memory (60% of 70%) for this configuration.
query.max-total-memory-per-node Greater than the maximum memory per node NA It denotes the sum of the user and the system memory that a query may use on a machine. It is only supported in Presto 0.208 and later versions. The value of this parameter should be greater than query.max-memory-per-node as this parameter considers both the user and system memory reservation while the latter only considers the user memory reservation.
query.max-memory 80GB, 20TB 100TB Maximum memory that a query can take aggregated across all nodes. To decrease or modify the default value, add it as a Presto override or set the query_max_memory session property.
query.schedule-split-batch-size 1000, 10000 1000 Number of splits scheduled at once.
query.max-queued-queries 6000 5000 Denotes the number of queries that can be queued. See Queue Configuration for more information on advanced queuing configuration options.
optimizer.optimize-single-distinct false true When enabled, the single distinct optimization tries to replace multiple DISTINCT clauses with a single GROUP BY clause, which can substantially speed up query execution. It is only supported in Presto 0.193 and earlier versions.
resources.reserved-system-memory 40/41% of Physical Memory 40% of Physical Memory This is the memory reserved for the system; if you set its value to more than 42% of Physical Memory, cluster failures occur.
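For example, an override that tightens the execution-time and memory limits (the values come from the example columns above and are illustrative; keep query.max-memory-per-node at or below 42% of Physical Memory as noted above) could look like this:

config.properties:
query.max-execution-time=45h
query.max-memory-per-node=10GB
query.max-memory=80GB

The execution-time limit can also be changed for a single session through its session property, for example:

SET SESSION query_max_execution_time='45h';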

Understanding the Task Management Properties

Parameter Examples Default Description
task.max-worker-threads 10, 20 4 * cores Maximum worker threads per JVM
task.writer-count The value must be a power of 2. 1

It is the number of concurrent writer tasks per worker per query when inserting data through INSERT or CREATE TABLE AS SELECT operations. You can set this property to make data writes faster. The equivalent session property is task_writer_count and its value must also be a power of 2. For more information, see Configuring the Concurrent Writer Tasks Per Query.

Caution

Use this configuration judiciously to prevent overloading the cluster through excessive resource utilization. It is recommended to set a higher value through the session property only for queries that generate larger outputs, for example, ETL jobs.
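For example, to raise the writer count only for a query that writes a large output, you could set the session property before running it (4 is an illustrative value; it must be a power of 2):

SET SESSION task_writer_count=4;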

Understanding the Timestamp Conversion Properties

Parameter Examples Default Description
client-session-time-zone Asia/Kolkata NA The timestamp fields in the output are automatically converted into the time zone specified by this property. It is helpful when you are in a different time zone than the Presto server; without this configuration, the timestamp fields in the output are displayed in the server's time zone.
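For example, to have timestamps displayed in Indian Standard Time (using the example value above), the override would be:

config.properties:
client-session-time-zone=Asia/Kolkata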

Understanding the Query Retry Mechanism Properties

Parameter Examples Default Description
retry.autoRetry true false It enables the Presto query retry mechanism feature at the cluster level.
retrier.max-wait-time-local-memory-exceeded 2m, 2s 5m The maximum time Presto waits for new nodes to join the cluster before giving up on retrying a query that failed with the LocalMemoryExceeded error. Its value is configured in seconds or minutes (for example, 2s or 2m). If a new node does not join the cluster within this time period, Qubole returns the original query failure response.
retrier.max-wait-time-node-loss 2m, 2s 3m The maximum time Presto waits for new nodes to join the cluster before giving up on retrying a query that failed due to a Spot node loss. Its value is configured in seconds or minutes (for example, 2s or 2m). If a new node does not join the cluster within the configured time period, the failed query is retried on the smaller-sized cluster.
retry.nodeLostErrors NA "REMOTE_HOST_GONE","TOO_MANY_REQUESTS_FAILED","PAGE_TRANSPORT_TIMEOUT" It is a comma-separated list of Presto errors (in a string form) that signify the node loss.
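For example, an override that enables the retry mechanism at the cluster level and shortens both wait windows (the values come from the example columns above and are illustrative) could look like this:

config.properties:
retry.autoRetry=true
retrier.max-wait-time-local-memory-exceeded=2m
retrier.max-wait-time-node-loss=2m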