Upgrade Considerations for Spark 3.0
When upgrading from Spark 2.4.x to Spark 3.0, you should review the following considerations and perform the necessary steps.
| Changes in Spark 3.0 | You must… |
| --- | --- |
| The `yarn-client` and `yarn-cluster` modes are removed in Spark 3.0. | Use `--master=yarn` and specify the `--deploy-mode` option in the `spark-submit` command-line arguments. |
| The `HiveContext` class is removed in Spark 3.0. | Use `SparkSession.builder.enableHiveSupport()` instead of the `HiveContext` class. |
| The order of arguments is reversed in the `TRIM` function in Spark 3.0. | Use `TRIM(str, trimStr)` instead of `TRIM(trimStr, str)`. |
| Due to the upgrade to Scala 2.12, `DataStreamWriter.foreachBatch` is not source-compatible with Scala programs. | Update your Scala source code to disambiguate between the Scala function and Java lambda overloads. |
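The `HiveContext` and `TRIM` changes above can be sketched together in a small Scala program. This is a minimal illustration, not a complete migration; the application name is a hypothetical placeholder.

```scala
import org.apache.spark.sql.SparkSession

object Spark3MigrationSketch {
  def main(args: Array[String]): Unit = {
    // Spark 3.0: enable Hive support on SparkSession instead of
    // constructing the removed HiveContext class.
    val spark = SparkSession.builder()
      .appName("spark3-migration-sketch") // hypothetical app name
      .enableHiveSupport()
      .getOrCreate()

    // Spark 3.0 argument order: TRIM(str, trimStr).
    // The same call in Spark 2.4 would have been TRIM('x', 'xxhelloxx').
    spark.sql("SELECT TRIM('xxhelloxx', 'x')").show() // trims 'x' from both ends

    spark.stop()
  }
}
```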
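For the `foreachBatch` change, a bare lambda in Scala 2.12 can match both the Scala-function and the Java `VoidFunction2` overloads, so the compiler reports an ambiguous reference. One way to disambiguate, sketched below with the built-in `rate` test source and the `noop` sink as stand-ins for a real pipeline, is to assign the lambda to an explicitly typed Scala function value:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ForeachBatchSketch {
  def run(spark: SparkSession): Unit = {
    // "rate" is a built-in streaming source used here only for illustration.
    val stream = spark.readStream
      .format("rate")
      .load()

    // The explicit (DataFrame, Long) => Unit type makes the compiler pick
    // the Scala overload of foreachBatch rather than the Java lambda one.
    val writeBatch: (DataFrame, Long) => Unit = (batch: DataFrame, batchId: Long) => {
      batch.write.mode("append").format("noop").save() // placeholder sink
    }

    stream.writeStream
      .foreachBatch(writeBatch)
      .start()
  }
}
```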
Note
- In Spark 3.0, Spark does not raise an exception for a SQL query that contains an implicit cross join.
- PySpark program errors are redirected to the Logs tab of the Analyze or Workbench page.
For more information about upgrading to Spark 3.0, see the Spark 3.0 Migration Guide.