Spark 2.3-latest is now set to Spark 2.3.2 in the QDS UI. Spark clusters running 2.3-latest will run 2.3.2 after a cluster restart.
Once Qubole support has activated RubiX caching for your account, use the QDS UI to enable it for a cluster by checking the Enable Rubix check box under the Advanced tab of the Clusters page when you create or modify a Spark cluster.
RubiX caching is supported only for Azure Blob storage (WASB).
Qubole Job History Server Upgrade
SPAR-3053: The multi-tenant Qubole Job History Server has been upgraded to Spark 2.3 (2.3.1 by default). This server provides access to the logs and history of Spark jobs that ran on clusters that have since been terminated.
- SPAR-3003: Cluster images now include the PyArrow package to support Pandas UDFs, enabling performance improvements in Spark 2.3.1. This enhancement is available via Support and is disabled by default for Spark 2.3.1. It is enabled by default for Spark 2.4 and later versions.
- SPAR-2649: You can now dynamically change max executors for a running Spark application from the Executors tab of the Spark Application UI. This capability is supported in Spark 2.3.1 and later versions.
- SPAR-3059: Fixes the following problem with native Optimized Row Columnar (ORC) with DirectFileOutputCommitter: if a task failed after writing partial files, the re-attempt also failed with FileAlreadyExistsException and the job failed. Fixed in Spark 2.4.
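
The failure mode described in SPAR-3059 can be illustrated with a small standalone simulation. This is only a sketch using the Python standard library: it is not Spark's committer code and not the actual Qubole fix. The function names `write_task_output` and `run_with_retry`, and the choice to delete the partial file before retrying, are assumptions made for illustration.

```python
import os
import tempfile


def write_task_output(path, rows, fail_after=None):
    """Write rows directly to the final path, refusing to overwrite it.

    fail_after (hypothetical knob) simulates a task that dies after
    writing that many rows, leaving a partial file behind.
    """
    # Mode "x" raises FileExistsError if the path already exists,
    # standing in for FileAlreadyExistsException in this sketch.
    with open(path, "x") as f:
        for i, row in enumerate(rows):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated task failure")
            f.write(row + "\n")


def run_with_retry(path, rows, cleanup_partial):
    """Run one failing attempt followed by one re-attempt."""
    try:
        write_task_output(path, rows, fail_after=1)
    except RuntimeError:
        if cleanup_partial and os.path.exists(path):
            os.remove(path)  # discard the partial file before retrying
    write_task_output(path, rows)  # the re-attempt


if __name__ == "__main__":
    rows = ["a", "b", "c"]
    with tempfile.TemporaryDirectory() as d:
        try:
            run_with_retry(os.path.join(d, "part-0"), rows, cleanup_partial=False)
        except FileExistsError:
            print("re-attempt failed: partial file already exists")
        run_with_retry(os.path.join(d, "part-1"), rows, cleanup_partial=True)
        print("re-attempt succeeded after cleanup")
```

Because a direct committer writes to the final location rather than to a temporary attempt directory, a partial file left by a failed attempt blocks the re-attempt unless it is removed or overwritten. Removing it first is one possible remedy, shown here only to make the problem concrete.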