Spark

New Features

Bug Fixes

  • SPAR-1098: A Spark command line can now be spread over multiple lines by ending each line with a backslash.
  • SPAR-2331: Resolves an issue that caused a Spark notebook to hang. The hang was mainly due to insufficient driver memory for that notebook and required fine-tuning the Spark interpreter settings.
  • SPAR-2458: Fixes a problem that prevented the NodeManager from terminating because the auxiliary ExternalShuffleService was running inside the NodeManager.
  • SPAR-2574: Fixes a problem with the ALTER TABLE RECOVER PARTITIONS command and spark.sql.qubole.recover.partitions. The issue occurred when there were invalid files and directories in the partition path.
  • SPAR-2584: Makes custom package deployment for Spark and Zeppelin more robust by handling error conditions and adding retries.
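The backslash convention mentioned in SPAR-1098 can be illustrated with a minimal Python sketch. This is not how QDS parses commands internally; it only shows the general idea of treating a trailing backslash as "continued on the next line". The function name `join_continued_lines`, the class `org.example.Main`, and the file `app.jar` are hypothetical and used for illustration only.

```python
def join_continued_lines(text):
    """Collapse backslash-continued lines into single logical command lines.

    Illustrative only: a line ending in a backslash is joined with the
    line that follows it. This is an assumption about the convention,
    not the actual QDS parser.
    """
    commands = []
    current = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            # Drop the trailing backslash and keep reading the next line.
            current.append(line[:-1].strip())
            continue
        current.append(line)
        joined = " ".join(part for part in current if part)
        if joined:
            commands.append(joined)
        current = []
    # A dangling continuation at the end of the input still forms a command.
    joined = " ".join(part for part in current if part)
    if joined:
        commands.append(joined)
    return commands


# Hypothetical command; the raw string keeps the backslashes literal.
EXAMPLE = r"""spark-submit \
  --class org.example.Main \
  --conf spark.executor.memory=8g \
  app.jar
"""

if __name__ == "__main__":
    for command in join_continued_lines(EXAMPLE):
        print(command)
```

The sketch returns one entry per logical command, so input that mixes continued and single-line commands is split correctly.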

Improvements

  • SPAR-2217: QDS now supports Spark 2.2.1; this version is reflected as 2.2 latest (2.2.1) in the Spark cluster UI. All 2.2.0 clusters are automatically upgraded to 2.2.1 in accordance with Qubole’s Spark versioning policy.

    Note

    Spark 2.2.1 will be rolled out as the latest version in a patch after the R52 release.

  • SPAR-2210: Qubole Spark supports the Hive 2.1 metastore for Spark 2.2.x. This feature is available for Beta access.

  • SPAR-2166: The default value of max-executors for a Spark application has been increased from 2 to 1000. If you want to use a different value, set the spark.dynamicAllocation.maxExecutors configuration explicitly at the Spark application level. If you want a different value for all Spark applications run on a cluster, set the value as a Spark override on that cluster. This takes effect only when the Spark application is run from the Analyze UI or through a REST API call. It does not apply to Spark notebooks.

  • SPAR-2264: Spark defaults have been changed for some instance types so that each executor is allocated at least 8 GB where possible. The default executor size for all instance types is now between 8 GB and 16 GB. To use executors with a larger memory allocation (up to 64 GB) on bigger instance types, create a ticket with Qubole Support.

  • SPAR-2332: These JARs have been updated in Spark 2.1 and 2.2 clusters:

    snowflake-jdbc: Upgraded from 3.4.0 to 3.5.3
    spark-snowflake_2.11: Upgraded from 2.2.8 to 2.3.0
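As a rough illustration of the SPAR-2166 change, the Python sketch below assembles a `spark-submit` argument list in which `spark.dynamicAllocation.maxExecutors` starts at the new default of 1000 and can be replaced by a cluster-level override or an application-level setting. The helper `build_spark_submit` and the file `app.jar` are hypothetical, and the precedence shown (application setting over cluster override over default) is an assumption made for the example, not a statement of how QDS merges settings.

```python
# Default described in SPAR-2166.
DEFAULT_CONF = {"spark.dynamicAllocation.maxExecutors": "1000"}


def build_spark_submit(app, cluster_overrides=None, app_conf=None):
    """Return a spark-submit argument list with merged configuration.

    Assumed precedence (illustrative only): application-level settings
    replace cluster-level overrides, which replace the defaults.
    """
    conf = dict(DEFAULT_CONF)
    conf.update(cluster_overrides or {})
    conf.update(app_conf or {})

    args = ["spark-submit"]
    for key in sorted(conf):
        # --conf key=value is the standard spark-submit way to set a property.
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # Cap one application at 50 executors on a cluster overridden to 200.
    print(" ".join(build_spark_submit(
        "app.jar",
        cluster_overrides={"spark.dynamicAllocation.maxExecutors": "200"},
        app_conf={"spark.dynamicAllocation.maxExecutors": "50"},
    )))
```

Keeping the merge in one place makes it easy to see which value an application would actually run with before the command is submitted.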