Using the Spill to Disk Mechanism

Presto supports offloading intermediate operation results to disk for memory intensive operations. This is called Spill to Disk mechanism. It enables execution of queries which would otherwise fail due to memory requirements exceeding maximum memory per node limit (defined by query.max-memory-per-node). It is a best effort mechanism which increases the chances of success for queries with high memory requirements but it does not guarantee that all memory intensive queries succeed. For more information, see Spill to Disk.

Note

Qubole recommends using the Spill to Disk mechanism from Presto 0.208. For more information, see:

Enabling Spill to Disk Mechanism on a Presto Cluster

You can enable the Spill to Disk mechanism for a Presto cluster through the cluster configuration overrides as illustrated below.

config.properties:
experimental.spiller-spill-path=<path to the directory that will be used to write the spilled data>
experimental.spill-enabled=true
experimental.max-spill-per-node=250GB
experimental.query-max-spill-per-node=100GB

Enabling Spill to Disk Mechanism on a Session

To enable the Spill to Disk mechanism at the query level, use the session property as mentioned here.

set session spill_enabled =true

Note

Spill to Disk works only with local disks on worker nodes and so, it does not work with the cloud object storage (for example, S3).

You must set the directory to write the spilled data if you want to enable the Spill to Disk mechanism at the cluster level/session level. To set the location that is used to write the spilled data at a query level, use the cluster-level configuration property as mentioned here:

config.properties:
experimental.spiller-spill-path=<path to the directory that will be used to write the spilled data>

For more info on the cluster-level configuration, see Enabling Spill to Disk Mechanism on a Presto Cluster.

Spill Path on the Local Disk

Starting Presto version 0.208, there is a default value for a spill-path that is, the location on the disk where intermediate operation results are offloaded. The default location used for spilling with the current configuration is located on worker nodes at /media/ephemeral0/presto/spill_dir. The default directory allows you to easily enable the spill to disk configuration on a session or enable it at the cluster level for all queries by passing it as a Presto override.

Configuring the Maximum Spill Per Node

You should configure the experimental.max-spill-per-node property (size for maximum spill per node) by considering the free disk space on /media/ephemeral0.

Here is a sample command to check the disk space on /media/ephemeral0 along with its output.

[root@ip-<ip-address> ~]# df -ah
Filesystem      Size  Used Avail Use% Mounted on
proc               0     0     0    - /proc
sysfs              0     0     0    - /sys
/dev/xvda1       59G   34G   26G  58% /
devtmpfs         61G  3.9M   61G   1% /dev
devpts             0     0     0    - /dev/pts
tmpfs            61G     0   61G   0% /dev/shm
none               0     0     0    - /proc/sys/fs/binfmt_misc
/dev/xvdaa      296G   71M  293G   1% /media/ephemeral0
/dev/xvdp       197G   68M  195G   1% /media/ebs3
/dev/xvdo       197G   64M  195G   1% /media/ebs2

When RubiX is enabled, the hadoop.cache.data.fullness.percentage override defines the maximum amount of disk space it can use. Its default value is 80%. So, with RubiX enabled, you should define the experimental.max-spill-per-node property by considering the value of hadoop.cache.data.fullness.percentage.