Spark 3.3.2 Features
Spark on Qubole provides customized features with Spark 3.3.2 as listed in the following table.
| Feature | Description | Reference |
|---|---|---|
| Pandas API: Increased Coverage with New Features | PySpark now natively understands `datetime.timedelta` across Spark SQL and the Pandas API on Spark; this Python type maps to the day-time interval type in Spark SQL. Many previously missing parameters and new API features are now supported in the Pandas API on Spark, including endpoints such as `ps.merge_asof`, `ps.timedelta_range`, and `ps.to_timedelta`. | |
| ANSI compliance | Supports the ANSI interval data types: interval values can be read from and written to tables, and intervals can be used in many functions and operators for date/time arithmetic, including aggregation and comparison. Implicit casting in ANSI mode now supports safe casts between types while protecting against data loss. A growing library of "try" functions, such as `try_add` and `try_multiply`, complements ANSI mode, letting users keep the safety of ANSI rules while still writing fault-tolerant queries. | |
| New built-in functions | A growing library of "try" functions (for example, `try_add` and `try_multiply`); nine new linear regression and statistical functions; four new string processing functions; `aes_encrypt` and `aes_decrypt` functions; generalized floor and ceiling functions; `to_number` formatting; and others. | |
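The new Pandas-API-on-Spark endpoints mirror their pandas counterparts. As a minimal sketch, the snippet below uses plain pandas (an assumption to keep it runnable without a cluster); on Spark 3.3.2 the same calls work with `import pyspark.pandas as ps` in place of `pd`.

```python
import datetime
import pandas as pd  # on a cluster: `import pyspark.pandas as ps` exposes the same API

# to_timedelta converts strings or numbers into timedelta values
td = pd.to_timedelta("1 day 6 hours")
assert td == datetime.timedelta(days=1, hours=6)

# timedelta_range builds a fixed-frequency index of timedeltas
idx = pd.timedelta_range(start="1 day", periods=3)  # 1, 2, 3 days
assert idx[2] == pd.Timedelta(days=3)
```

Because `datetime.timedelta` maps to the day-time interval type, such values round-trip between Spark SQL and the Pandas API on Spark without manual conversion.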
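Day-time interval arithmetic behaves like Python's own `datetime` arithmetic. The sketch below is an illustrative analogy, not Spark code: each assertion mirrors what the equivalent Spark SQL expression (shown in the comments) returns.

```python
from datetime import date, timedelta

# Spark SQL: SELECT DATE'2023-01-01' + INTERVAL '2' DAY  -> 2023-01-03
assert date(2023, 1, 1) + timedelta(days=2) == date(2023, 1, 3)

# Intervals also support comparison (and hence aggregation such as max/min):
# Spark SQL: SELECT greatest(INTERVAL '6' HOUR, INTERVAL '1' DAY) -> 1 day
assert max(timedelta(hours=6), timedelta(days=1)) == timedelta(days=1)
```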
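The "try" functions return NULL instead of failing the query when ANSI rules would raise an error. The pure-Python sketch below models that behavior for 4-byte integer inputs (an illustrative assumption, not Spark's implementation; in Spark you would call `try_add`/`try_multiply` in SQL or via `pyspark.sql.functions`).

```python
# Spark's IntegerType is a signed 32-bit integer
INT_MIN, INT_MAX = -(2**31), 2**31 - 1

def try_add(a, b):
    """a + b, or None if an input is None or the sum overflows IntegerType."""
    if a is None or b is None:
        return None
    s = a + b
    return s if INT_MIN <= s <= INT_MAX else None

def try_multiply(a, b):
    """a * b, or None if an input is None or the product overflows IntegerType."""
    if a is None or b is None:
        return None
    p = a * b
    return p if INT_MIN <= p <= INT_MAX else None

print(try_add(2**31 - 1, 1))  # None: NULL instead of an ANSI overflow error
```

In ANSI mode, the plain `+` and `*` operators would instead fail the whole query on overflow, which is the trade-off the "try" variants exist to soften.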
For more information about all the features and enhancements in Apache Spark 3.3.2, see the Apache Spark 3.3.2 documentation.