What is New
The new features and enhancements are listed in the section below.
Note
The label (in blue) next to each description indicates the launch state, availability, and default state of the feature or enhancement. For more information, click the label.
Unless stated otherwise, features are generally available, available as self-service and enabled by default.
Cluster Management
Qubole now supports c5d, m5d, r5, r5d, and z1d instances. Learn more.
QDS has introduced a new cluster console that provides granular visibility, enhanced governance, and ease of use, with features such as the ability to track a specific cluster’s activity and to view the history of its configuration changes, cluster usage, and instances. Learn more. Beta
QDS supports configuring multiple subnets in a VPC for clusters of all compositions, including heterogeneous clusters. Learn more. Cluster Restart Required
Hive
Hive 2.1 is now generally available. Learn more. Cluster Restart Required
Qubole has added HAProxy on the cluster coordinator node to balance the load when there are multiple connections between the cluster and the Qubole-managed Hive Metastore. This removes a single point of failure and improves stability. Learn more. Via Support
Hadoop 1
Hadoop 1 as-a-service is deprecated as of this QDS version. Qubole will support Hadoop 1 on existing clusters until 31 December 2018. Creating a new Hadoop 1 cluster or cloning an existing Hadoop 1 cluster is not supported through the API or the UI. Learn more.
Presto
Qubole has added a feature that automatically retries failed Presto queries (when possible) while nodes are being added as part of autoscaling. Learn more. Disabled Cluster Restart Required
Presto Notebooks are now generally available. Learn more.
The latest supported version is Presto 0.208. Learn more. Beta Cluster Restart Required
Spark
Qubole Spark supports the RubiX distributed file caching system. Learn more. Beta Via Support Disabled
Qubole Spark provides Dynamic Filtering to improve join query performance. Learn more. Via Support Disabled
Sparklens, an experimental open service tool, is available at http://sparklens.qubole.net. Learn more.
Qubole Spark supports Parquet footer metadata caching to improve query execution performance. Learn more. Via Support Disabled
Qubole Spark supports proactive cleanup of shuffle block data to enable faster downscaling of nodes. Learn more. Via Support Disabled
Autoscaling is enabled by default for clusters. The default value for the maximum number of autoscaling nodes has been increased from 2 to 10 for a new Spark cluster. Learn more.
Large Spark SQL commands are now supported through the API and on the Analyze page. Learn more. Via Support Disabled
Qubole Spark command subtypes are now supported with script files containing macros. Learn more. Via Support Disabled
Deprecated Spark Versions
In this release, the following Spark versions are deprecated: 1.5.1, 1.6.0, 1.6.1, 2.0.0, and 2.1.0. Qubole will continue to support Spark 1.6.2 and the latest maintenance version of each minor version in Spark 2.x. See the version support documentation.
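As a minimal sketch of how an administrator might flag clusters affected by this deprecation, the following Python snippet checks Spark version strings against the list above. The function names and the cluster labels in the usage note are hypothetical and are not part of any Qubole API; only the version numbers come from this release note.

```python
# Spark versions listed as deprecated in this release note.
DEPRECATED_SPARK_VERSIONS = frozenset(
    {"1.5.1", "1.6.0", "1.6.1", "2.0.0", "2.1.0"}
)


def is_deprecated_spark_version(version: str) -> bool:
    """Return True if `version` is one of the versions deprecated in this release."""
    return version.strip() in DEPRECATED_SPARK_VERSIONS


def find_deprecated_clusters(cluster_versions: dict) -> list:
    """Return the sorted labels of clusters that run a deprecated Spark version.

    `cluster_versions` maps a cluster label to its Spark version string.
    Both the mapping and its labels are illustrative assumptions.
    """
    return sorted(
        label
        for label, version in cluster_versions.items()
        if is_deprecated_spark_version(version)
    )
```

For example, calling `find_deprecated_clusters({"etl": "2.1.0", "adhoc": "1.6.2"})` reports only the hypothetical `etl` cluster, because Spark 1.6.2 remains supported.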
Spark Structured Streaming
Qubole provides comprehensive support for the Kinesis connector in Structured Streaming:
Support to read from Kinesis Source in micro-batch streaming and continuous streaming modes.
Support to ingest data into Kinesis.
Support for IAM roles in the Kinesis connector.
Support to write a structured streaming query to a Spark data source table. Learn more.
Streaming query progress graphs are now displayed in notebooks. Learn more. Via Support Disabled
Direct writes for checkpointing are supported. Learn more. Via Support Disabled
Notebooks
Notebooks can be exported as PNG, PDF, and HTML files. Learn more. Beta Disabled
Cluster Status is available on the Notebooks page. Learn more. Beta Via Support Disabled
Spark application status is available on the Notebooks page. Learn more. Beta Via Support Disabled
Administration
Administrators can now track usage on Qubole using the QCUH dashboard. Beta
Qubole has introduced a new Service user type. Beta, Via Support, Disabled
Administrators can now configure an additional resource, Data Preview, on Hive tables while managing resources on the Manage Roles page.
Data Analytics
QDS now lets customers request a maximum concurrent command limit, expressed as a percentage, for all users of an account. Via Support, Disabled
Data Engineering
Explore
Qubole has enabled support for AWS Aurora-MySQL RDS as a database backend for Airflow contents.
Airflow
Python 3.5 is now supported on Airflow clusters. You can now manage add-on packages using package management. Beta, Via Support, Disabled
Qubole now enables you to monitor the health of Airflow clusters using integrated Monit, and lets you turn certain services on or off. Cluster Restart Required
Continuous development, integration, and deployment of Airflow DAGs is now supported with the Qubole UI. Beta, Via Support, Disabled, Cluster Restart Required
Security
Apache Ranger integration for Hive workloads helps security administrators define fine-grained data access policies across users and user groups.
Security administrators can define and enforce RBAC policies across multiple Qubole artifacts that contain data and metadata, such as commands, data store connections, data previews, and results.