What’s New and Key Enhancements

The new features and key enhancements are listed in the corresponding tabs below.

Note

The blue label next to a description indicates the launch stage, availability, and default state of the feature or enhancement. For more information, click the label.

Unless stated otherwise, features are generally available, available as self-service, and enabled by default.

  • Qubole now lets you enable and disable features at the account level through a self-service interface.
  • Qubole has updated its API throttling policy. Gradual Rollout.

Learn more.

  • Workbench is now generally available.
  • Workbench lets you customize columns that you can view in the Results pane.

Learn more.

  • Quest, a data engineering product offered by Qubole, has been renamed Qubole Pipelines Service. The Quest UI is now called the Pipelines UI.
  • Qubole Pipelines Service is now available as a beta feature for all user accounts. Beta.
  • The Jupyter Notebook command is now available in QuboleOperator as jupytercmd, so users can schedule their Jupyter notebooks in Airflow (Cluster Restart Required).
  • Users can pass the argument arguments=['true'] to the Qubole Operator’s get_result() method to retrieve headers (Cluster Restart Required).
  • Qubole now supports Apache Airflow version 1.10.9QDS. This Airflow version is supported only with Python version 3.7.
  • On the Clusters page, under the cluster details, users can now delete the DAGs that they uploaded through the DAG explorer (Cluster Restart Required).
  • For Airflow version 1.10.9QDS, Qubole exports metrics through statsd. Prometheus scrapes the metrics from statsd, and Qubole displays them in Grafana.
  • Qubole now lets users customize the port on which the Hive Metastore Server (HMS) runs on clusters.

Learn more.
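
For context on the Airflow metrics path described above: StatsD clients emit plain-text datagrams of the form name:value|type, which an exporter can then expose for Prometheus to scrape. The following is a minimal sketch of that wire format using only the Python standard library. The metric names, host, and port are hypothetical placeholders, not values taken from Qubole or Airflow.

```python
import socket


def format_statsd(name, value, metric_type):
    """Build one StatsD datagram: <name>:<value>|<type>."""
    # Common type codes: "c" counter, "g" gauge, "ms" timer.
    if metric_type not in ("c", "g", "ms"):
        raise ValueError("unsupported metric type: %r" % (metric_type,))
    return "{}:{}|{}".format(name, value, metric_type)


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (fire-and-forget, no delivery guarantee)."""
    # Host and port are assumptions; use the address of your own StatsD listener.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# Hypothetical metric names, for illustration only.
counter_line = format_statsd("example.dag_runs", 1, "c")
timer_line = format_statsd("example.task_duration", 320, "ms")
```

Calling send_statsd(counter_line) would push the counter to a listener on the assumed address. Because the transport is UDP, the sender gets no confirmation that the metric arrived.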

  • Hive 1.2 has been deprecated on Hadoop 2 clusters. Learn more.
  • Qubole has upgraded the Qubole-managed Hive metastore DB’s schema to Hive 2.3. Learn more.
  • Hive version 3.1.1 (beta) is now more robust and performant. Learn more.
  • You can now run Hive version 2.3 with Tez version 0.9.1. Learn more.
  • Qubole’s Hive version 2.3 is now on par with open-source Hive version 2.3.6. Learn more.
  • Jupyter notebooks provide Qviz, a data visualization framework that enables users to visualize dataframes with improved charting options and Python plots on the Spark driver. Gradual Rollout.
  • In Jupyter notebooks, users can use the %run magic to run a notebook from the current notebook.
  • In Jupyter notebooks, users can get autocomplete suggestions for Spark and PySpark notebooks, and docstring help in PySpark notebooks.
  • The Package Management UI has been redesigned and includes several enhancements. Users of existing accounts should contact Qubole Support to enable this feature. Learn more.

Learn more about the new features in Jupyter Notebooks.

  • Users can perform Spark SQL UPDATE and DELETE operations on Hive ACID tables. Contact Qubole Support to enable this feature.
  • Users can write the results of a streaming query to a Hive ACID table. Contact Qubole Support to enable this feature.
  • With Adaptive Query Execution, query execution is optimized at runtime based on runtime statistics. Gradual Rollout.
  • Using dynamic filtering values, Dynamic Partition Pruning selects the specific partitions of a table that need to be read at runtime, which improves performance. Gradual Rollout.
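
As a rough illustration of the idea behind Dynamic Partition Pruning, the toy model below (plain Python, not Spark code and not Qubole's implementation) first evaluates a filter on a small dimension table, collects the resulting join keys at runtime, and then reads only the fact-table partitions whose partition key is in that set. All table names and data are made up for the example.

```python
# Fact table stored as partitions keyed by the partition column (a date here).
sales_partitions = {
    "2020-01-01": [("2020-01-01", 10), ("2020-01-01", 5)],
    "2020-01-02": [("2020-01-02", 7)],
    "2020-01-03": [("2020-01-03", 12)],
}

# Small dimension table: (date, is_holiday).
dates_dim = [
    ("2020-01-01", True),
    ("2020-01-02", False),
    ("2020-01-03", False),
]


def dynamic_filter_values(dim_rows):
    """Evaluate the dimension-side filter and collect the matching join keys."""
    return {date for date, is_holiday in dim_rows if is_holiday}


def pruned_scan(partitions, keep_keys):
    """Read only partitions whose key is in keep_keys; report which were read."""
    read = sorted(key for key in partitions if key in keep_keys)
    rows = [row for key in read for row in partitions[key]]
    return read, rows


keys = dynamic_filter_values(dates_dim)
partitions_read, rows = pruned_scan(sales_partitions, keys)
total = sum(amount for _, amount in rows)
```

In this toy data set, only one of the three partitions is scanned. The saving comes from deciding which partitions to skip after the dimension filter has been evaluated, rather than at planning time.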