Upgrade Considerations for Spark 3.0

When upgrading from Spark 2.4.x to Spark 3.0, you should review the following considerations and perform the necessary steps.

Changes in Spark 3.0

You must make the following changes:

The yarn-client and yarn-cluster master values are removed in Spark 3.0.

Use --master yarn and specify the deploy mode with the --deploy-mode option (client or cluster) in the spark-submit command-line arguments.
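For example, a job previously submitted with --master yarn-cluster can be rewritten as follows (this is a sketch; the application JAR, class name, and resource settings are placeholders):

```shell
# Spark 2.4.x style (removed in Spark 3.0):
#   spark-submit --master yarn-cluster --class com.example.MyApp my-app.jar

# Spark 3.0 style: specify the resource manager and deploy mode separately.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  my-app.jar
```

Use --deploy-mode client instead of cluster to replace the former yarn-client mode.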

The HiveContext class is removed in Spark 3.0.

Use SparkSession.builder.enableHiveSupport() instead of the HiveContext class.
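A minimal PySpark sketch of the replacement, assuming a Spark 3.0 installation with Hive support available (the application name is a placeholder):

```python
from pyspark.sql import SparkSession

# Spark 2.x style (removed in 3.0):
#   from pyspark.sql import HiveContext
#   hive_ctx = HiveContext(sc)

# Spark 3.0 style: enable Hive support on the session builder.
spark = (
    SparkSession.builder
    .appName("hive-example")  # placeholder application name
    .enableHiveSupport()
    .getOrCreate()
)

# Hive-backed queries now run through the session itself.
spark.sql("SHOW DATABASES").show()
```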

The order of arguments is reversed in the TRIM function in Spark 3.0.

Use TRIM(str, trimStr) instead of TRIM(trimStr, str).
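For instance, a query that trims the character 'x' from both ends of a column changes as follows (the table and column names are illustrative):

```sql
-- Spark 2.4.x: the trim string is the first argument.
SELECT TRIM('x', name) FROM users;

-- Spark 3.0: the source string comes first.
SELECT TRIM(name, 'x') FROM users;
```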

Due to the upgrade to Scala 2.12, DataStreamWriter.foreachBatch is not source-compatible with Scala programs.

Update your Scala source code to disambiguate between the Scala function and Java lambda overloads.
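As a sketch of the disambiguation (the dataset and handler names are illustrative), assigning the batch handler to an explicitly typed Scala function value avoids the overload ambiguity introduced by Scala 2.12:

```scala
import org.apache.spark.sql.DataFrame

// Ambiguous under Scala 2.12: the bare lambda can match both the
// Scala-function and the Java VoidFunction2 overloads of foreachBatch.
//   df.writeStream.foreachBatch((batch, id) => batch.show()).start()

// Explicitly typed Scala function value resolves the overload:
val handler: (DataFrame, Long) => Unit = (batch, batchId) => {
  batch.show()
}
// df.writeStream.foreachBatch(handler).start()
```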

Note

  • Spark 3.0 does not raise an exception for SQL queries that contain an implicit cross join.

  • PySpark program errors are redirected to the Logs tab of the Analyze or Workbench page.

For more information about upgrading to Spark 3.0, see the Spark 3.0 Migration Guide.