What is New
The new features and enhancements are listed in the section below.
The label (in blue) next to each description indicates the launch state, availability, and default state of the feature or enhancement. For more information, click the label.
Unless stated otherwise, features are generally available, available as self-service and enabled by default.
- Qubole now supports z1d instances. Learn more.
- QDS has introduced a new cluster console that provides granular visibility, enhanced governance, and ease of use, with features such as the ability to track a specific cluster’s activity and to view the history of its configuration changes, cluster usage, and instances. Learn more. Beta
- QDS supports configuring multiple subnets in a VPC for clusters of all compositions, including heterogeneous clusters. Learn more. Cluster Restart Required
- Hive version 2.1 is now generally available. Learn more. Cluster Restart Required
- Qubole has added HAProxy on the cluster master node to balance the load when there are multiple connections between the cluster and the Qubole-managed Hive Metastore. This removes a single point of failure and provides more stability. Learn more. Via Support
Hadoop 1 as-a-service is deprecated as of this QDS version. Qubole will support Hadoop 1 on existing clusters until 31 December 2018. Creating a new Hadoop 1 cluster or cloning an existing Hadoop 1 cluster is not supported through either the API or the UI. Learn more.
- Qubole has added a feature that automatically retries unsuccessful Presto queries (where possible) when nodes are being added as part of autoscaling. Learn more. Disabled Cluster Restart Required
- Presto Notebooks are now generally available. Learn more.
- The latest supported version is Presto 0.208. Learn more. Beta Cluster Restart Required
- Qubole Spark supports the RubiX distributed file caching system. Learn more. Beta Via Support Disabled
- Qubole Spark provides Dynamic Filtering to improve join query performance. Learn more. Via Support Disabled
- Sparklens, an experimental open service tool, is available at http://sparklens.qubole.net. Learn more.
- Parquet footer metadata caching to improve query execution performance. Learn more. Via Support Disabled
- Proactive cleanup of shuffle block data to enable faster downscaling of nodes. Learn more. Via Support Disabled
- Autoscaling is enabled by default for clusters. The default value for the maximum number of autoscaling nodes has been increased from 2 to 10 for a new Spark cluster. Learn more.
- Large Spark SQL commands are now supported in the API and on the Analyze page. Learn more. Via Support Disabled
- Qubole Spark command subtypes are now supported with script files containing macros. Learn more. Via Support Disabled
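As a rough illustration of the general idea behind the Dynamic Filtering item above, the sketch below joins two small in-memory tables and uses the join keys found on the small side to discard rows from the large side before joining. This is a toy model of the technique, not Qubole's implementation; the function name `join_with_dynamic_filter` and the row layout are hypothetical.

```python
def join_with_dynamic_filter(fact_rows, dim_rows, key):
    """Inner-join two lists of dicts, pruning the large side first.

    Hypothetical helper for illustration only. Returns the joined rows
    and the number of large-side rows that survived the filter.
    """
    # Build side: index the (small) dimension rows by join key.
    dim_by_key = {}
    for row in dim_rows:
        dim_by_key.setdefault(row[key], []).append(row)

    # The "dynamic filter" is the set of join keys seen at run time.
    dynamic_filter = set(dim_by_key)

    # Probe side: drop rows that cannot possibly match. A real engine
    # would push this filter into the table scan instead.
    surviving = [row for row in fact_rows if row[key] in dynamic_filter]

    # Join only the surviving rows against the indexed build side.
    joined = [
        {**fact, **dim}
        for fact in surviving
        for dim in dim_by_key[fact[key]]
    ]
    return joined, len(surviving)
```

The benefit grows with the selectivity of the small side: the fewer distinct keys it contains, the more large-side rows are skipped before the join runs.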
Spark Structured Streaming
Qubole provides comprehensive support for the Kinesis connector in Structured Streaming.
- Support to read from Kinesis Source in micro-batch streaming and continuous streaming modes.
- Support to ingest data into Kinesis.
- Support for IAM roles in Kinesis Connector.
Support to write a structured streaming query to a Spark data source table. Learn more.
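To make the Kinesis source support described above more concrete, here is a minimal PySpark sketch of reading from a Kinesis stream. The option names (`streamName`, `endpointUrl`, `startingposition`) follow the open-source Kinesis connector for Structured Streaming as best recalled and should be treated as assumptions to verify against the connector documentation. The helper names, stream name, and region are hypothetical. No access keys are passed, on the assumption that credentials come from the cluster's IAM role.

```python
def kinesis_source_options(stream_name, region, starting_position="TRIM_HORIZON"):
    """Build the option map for a Kinesis streaming source.

    Option names are assumptions; check them against the connector docs.
    """
    return {
        "streamName": stream_name,
        # Regional Kinesis endpoint for the given AWS region.
        "endpointUrl": f"https://kinesis.{region}.amazonaws.com",
        "startingposition": starting_position,
    }


def read_kinesis_stream(spark, stream_name, region):
    """Return a streaming DataFrame backed by a Kinesis stream.

    `spark` is an existing SparkSession on a cluster where the Kinesis
    connector is available.
    """
    options = kinesis_source_options(stream_name, region)
    return spark.readStream.format("kinesis").options(**options).load()
```

On a cluster, a call such as `read_kinesis_stream(spark, "events", "us-east-1")` would return a streaming DataFrame that can then be transformed and written out with `writeStream`.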
- Notebooks can be exported as PNG, PDF, and HTML files. Learn more. Beta Disabled
- Cluster Status is available on the Notebooks page. Learn more. Beta Via Support Disabled
- Spark application status is available on the Notebooks page. Learn more. Beta Via Support Disabled
- Administrators can now track usage on Qubole using the QCUH dashboard. Beta
- Qubole has introduced a new Service user type. Beta, Via Support, Disabled
- Administrators can now configure an additional resource, Data Preview, on Hive tables while managing resources on the Manage Roles page.
- QDS now lets customers request a maximum concurrent command limit percentage for all users of an account. Via Support, Disabled
- Qubole has enabled support for AWS Aurora-MySQL RDS as a database backend for Airflow contents.
- Python 3.5 is now supported on Airflow clusters. You can now manage add-on packages using package management. Beta, Via Support, Disabled
- Qubole now enables you to monitor the health of Airflow clusters using integrated Monit, and lets you turn certain services on or off. Cluster Restart Required
- Continuous development, integration, and deployment of Airflow DAGs with the Qubole UI. Beta, Via Support, Disabled Cluster Restart Required
- Apache Ranger integration for Hive workloads helps security administrators define fine-grained data access policies across users and user groups.
- Security administrators can define and enforce RBAC policies across multiple Qubole artifacts that contain data and metadata, such as commands, data store connections, data previews, and results.