What’s New and Key Enhancements
The new features and key enhancements are listed in the corresponding tabs below.
Note
The label (in blue) next to each description indicates the launch stage, availability, and default state of the feature or enhancement. For more information, click the label.
Unless stated otherwise, features are generally available, available as self-service, and enabled by default.
Qubole now lets you enable and disable features at the account level through a self-service interface.
Qubole has updated its API throttling policy. | Gradual Rollout
Qubole now supports an account-level node bootstrap. Learn more. | Cluster Restart Required
Qubole has separated the configuration of the coordinator node from minimum number of nodes. Learn more. | Cluster Restart Required
Qubole now allows setting capacity-optimized as the spot allocation strategy for clusters. Learn more. | Gradual Rollout | Cluster Restart Required
Qubole has added support for new instances. Learn more. | Cluster Restart Required
QDS now gracefully terminates commands running on the cluster after its health check fails. Learn more.
Qubole has enhanced its mechanism of handling cluster terminations. Learn more. | Gradual Rollout | Cluster Restart Required
Qubole has improved cluster notifications. Learn more.
Workbench is now generally available.
Workbench lets you customize columns that you can view in the Results pane.
Quest, a data engineering product offered by Qubole, has been renamed Qubole Pipelines Service. The Quest UI is now called the Pipelines UI.
Qubole Pipelines Service is now available as a beta feature for all user accounts. Beta.
The Jupyter Notebook command is now available in QuboleOperator as jupytercmd, so users can schedule their Jupyter notebooks in Airflow (Cluster Restart Required).
Qubole provides the ability to pass the argument arguments=['true'] in Qubole Operator’s get_result() method to retrieve headers (Cluster Restart Required).
Qubole now supports Apache Airflow version 1.10.9QDS. This Airflow version is supported only with Python version 3.7.
Users now have an option on the Clusters page, under the cluster details, to delete the DAGs that they uploaded through the DAG explorer (Cluster Restart Required).
For Airflow version 1.10.9QDS, Qubole exports metrics via statsd. Prometheus scrapes the metrics from statsd, and Qubole displays them on Grafana.
Qubole now allows users to customize the port where the Hive Metastore Server (HMS) runs on clusters.
Qubole plans to gracefully terminate Shell CLI commands. Learn more. | Gradual Rollout | Cluster Restart Required
End of Life for Hadoop 1 and Hadoop 2.8. Learn more.
Hive 1.2 has been deprecated on Hadoop 2 clusters. Learn more.
Qubole has upgraded the Qubole-managed Hive metastore DB’s schema to Hive 2.3. Learn more.
Hive version 3.1.1 (beta) is now more robust and performant. Learn more.
You can now run Hive version 2.3 with Tez version 0.9.1. Learn more.
Qubole’s Hive version 2.3 is now at par with open-source Hive version 2.3.6. Learn more.
Jupyter notebooks provide Qviz, a data visualization framework that enables users to visualize dataframes with improved charting options and python plots on the Spark driver. Gradual Rollout.
In Jupyter notebooks, users can use the %run magic to run a notebook from the current notebook.
In Jupyter notebooks, users can get autocomplete suggestions for Spark and PySpark notebooks, and docstring help in PySpark notebooks.
The Package Management UI has been redesigned and enhanced. Users of existing accounts should contact Qubole Support to enable this feature. Learn more.
Presto version 317 is generally available now. Learn more. | Cluster Restart Required
The BigQuery connector is now available on Presto version 317. Learn more.
Presto introduces dynamic concurrency and hybrid autoscaling. Learn more. | Gradual Rollout | Beta | Cluster Restart Required
Enhancements in Qubole Presto for JDBC and ODBC drivers. Learn more.
Improvements in Dynamic Filtering. Learn more. | Cluster Restart Required
Improvements in reading Hive ACID tables. Learn more.
Qubole has added new Datadog alerts and removed a Datadog alert. Learn more.
Qubole has added weighted distribution of CPU in worker nodes based on resource groups. Learn more. | Beta, Disabled | Cluster Restart Required
Users can perform the Spark SQL UPDATE and DELETE operations on Hive ACID tables. Contact Qubole Support to enable this feature.
Users can write the results of a streaming query to a Hive ACID table. Contact Qubole Support to enable this feature.
With Adaptive Query Execution, query execution is optimized at runtime based on runtime statistics. Gradual Rollout.
Using the Dynamic Filtering values, Dynamic Partition Pruning selects the specific partitions within the table that need to be read at runtime, which improves performance. Gradual Rollout.
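The idea behind Dynamic Partition Pruning can be illustrated with a small, engine-independent sketch in plain Python. It is not Qubole's or Spark's implementation; the table contents and the function names (collect_filter_values, prune_partitions, join_total) are hypothetical and exist only for illustration. The sketch first collects the join-key values that survive the filter on the small table, then reads only the matching partitions of the large table.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical fact table, partitioned by date: partition key -> rows.
FACT_PARTITIONS: Dict[str, List[dict]] = {
    "2020-01-01": [{"date": "2020-01-01", "amount": 10}],
    "2020-01-02": [{"date": "2020-01-02", "amount": 20}],
    "2020-01-03": [{"date": "2020-01-03", "amount": 30}],
}

# Hypothetical small dimension table, joined on the partition column.
DIM_ROWS: List[dict] = [
    {"date": "2020-01-01", "is_holiday": True},
    {"date": "2020-01-02", "is_holiday": False},
    {"date": "2020-01-03", "is_holiday": True},
]


def collect_filter_values(dim_rows: List[dict],
                          predicate: Callable[[dict], bool]) -> Set[str]:
    """Evaluate the dimension-side filter and collect the surviving join keys.

    These keys play the role of the dynamic filtering values.
    """
    return {row["date"] for row in dim_rows if predicate(row)}


def prune_partitions(partitions: Dict[str, List[dict]],
                     filter_values: Set[str]) -> List[str]:
    """Keep only the partitions whose key appears in the filter values."""
    return sorted(key for key in partitions if key in filter_values)


def join_total(partitions: Dict[str, List[dict]],
               dim_rows: List[dict],
               predicate: Callable[[dict], bool]) -> Tuple[List[str], int]:
    """Sum 'amount' over the join result, reading only the pruned partitions."""
    values = collect_filter_values(dim_rows, predicate)
    to_read = prune_partitions(partitions, values)
    total = sum(row["amount"] for key in to_read for row in partitions[key])
    return to_read, total


if __name__ == "__main__":
    read, total = join_total(FACT_PARTITIONS, DIM_ROWS,
                             lambda row: row["is_holiday"])
    print("partitions read:", read)
    print("total amount:", total)
```

In this sketch the partition for 2020-01-02 is never read, because no dimension row with that date passes the filter. A real engine makes the same decision at runtime, after the dimension side of the join has been evaluated.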