Cluster Management

Node Bootstrap Script at Multiple Points in the Node Startup

ACM-4804: Qubole now supports running a bootstrap script at multiple points in the node startup sequence in YARN-based clusters. Cluster Restart Required

Specifically, you can add the following execution functions:

  • On the coordinator node:
    • Before any of the services are started
    • After all the services are started
  • On worker nodes:
    • Before any of the services are started
    • After services (such as data node) are started but before the NodeManager is started, that is, before tasks can start running
    • After the NodeManager is started
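
The ordering of these hook points can be pictured with a small model. This is only an illustrative sketch: the step labels are descriptive strings invented here, not the actual function names that Qubole expects in a bootstrap script (those are in the documentation).

```python
def startup_sequence(role):
    """Return the ordered startup steps for a node, with hook points marked.

    The labels are illustrative only; they are not Qubole identifiers.
    """
    if role == "coordinator":
        return [
            "hook: before any services start",
            "start all services",
            "hook: after all services start",
        ]
    if role == "worker":
        return [
            "hook: before any services start",
            "start services (such as data node)",
            "hook: after services, before NodeManager",
            "start NodeManager (tasks can now run)",
            "hook: after NodeManager start",
        ]
    raise ValueError(f"unknown role: {role!r}")


for step in startup_sequence("worker"):
    print(step)
```

The middle worker hook is the last point at which setup is guaranteed to finish before any task is scheduled on the node.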

For more information, see the documentation.

EBS Volumes Encrypted by Default

ACM-5277: QDS now enables encryption on EBS volumes by default. This applies to all volumes provisioned at instance start as well as to new EBS volumes added for upscaling. By default, Qubole uses the account's default encryption key for the AWS region. To use KMS (SSE-C) keys at the account level, contact Qubole Support with the SSE-C key.

As data on NVMe volumes is also encrypted by default, block device encryption is now used only for instance store volumes on older generation (first and second) instance types. Qubole strongly recommends using newer generation (*3/*4/*5) instances with EBS or NVMe disks for large clusters.

Because encryption on EBS volumes is now enabled by default, Qubole has removed the option for enabling encryption from the Clusters UI configuration page.

Proactive Replacement of Spot Nodes

ACM-5645: To minimize the impact on running queries, Qubole now proactively replaces Spot block nodes with newer nodes that have a later expiry time, before AWS reclaims the expiring nodes.
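
As a rough illustration of the idea, the sketch below picks the nodes whose Spot block is about to expire so that replacements can be started ahead of time. It is not Qubole's implementation: the function name, the node IDs, and the 30-minute lead time are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical lead time: how far ahead of expiry a node is replaced.
# The release note does not state the value Qubole uses.
REPLACEMENT_LEAD_TIME = timedelta(minutes=30)


def nodes_due_for_replacement(expiry_by_node, now, lead_time=REPLACEMENT_LEAD_TIME):
    """Return the IDs of nodes whose Spot block expires within lead_time of now."""
    return sorted(
        node_id
        for node_id, expiry in expiry_by_node.items()
        if expiry - now <= lead_time
    )


now = datetime(2020, 1, 1, 12, 0)
expiries = {
    "node-a": now + timedelta(minutes=10),  # expires soon: replace
    "node-b": now + timedelta(hours=3),     # plenty of time left: keep
}
print(nodes_due_for_replacement(expiries, now))
```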

Newly Supported Instance Types

ACM-5876: Qubole now supports m5n, m5dn, r5n, and r5dn instance types. Cluster Restart Required

ACM-6032: Qubole now supports G4 instance types. Cluster Restart Required

ACM-6165: Qubole now supports r5d.8xlarge, r5d.16xlarge, m5d.8xlarge, and m5d.16xlarge instance types. Cluster Restart Required

Changes in Heterogeneous Clusters Configuration

These are changes in the heterogeneous clusters configuration:

  • ACM-5607: When you enable heterogeneous configuration on the Clusters UI page, the UI now suggests instances similar to the chosen worker node type but from different generations. Earlier, it suggested the instance with double the weight from the same generation. Gradual Rollout | Cluster Restart Required
  • ACM-6135: Qubole now supports using an EBS disk size proportional to the node weight in heterogeneous clusters. When this enhancement is enabled, the EBS disk size that you specify in the cluster configuration on the UI applies to the base worker type. For the remaining instance types in the heterogeneous configuration, Qubole multiplies this EBS disk size by the node weight (the ratio of the instance's memory to that of the base instance type). Gradual Rollout | Cluster Restart Required
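
The proportional sizing described in ACM-6135 comes down to simple arithmetic, sketched below. The function name and the example instance mix are assumptions chosen for illustration, and the release note does not say whether Qubole rounds the result, so no rounding is applied here.

```python
def proportional_ebs_size_gb(base_size_gb, base_memory_gib, instance_memory_gib):
    """Scale the configured EBS size by the node weight.

    Node weight = instance memory / base worker type memory,
    as described in the release note.
    """
    node_weight = instance_memory_gib / base_memory_gib
    return base_size_gb * node_weight


# Illustrative heterogeneous configuration with memory in GiB.
# The first entry is treated as the base worker type.
memory_gib = {"m5.xlarge": 16, "m5.2xlarge": 32, "m4.4xlarge": 64}
base_size_gb = 100  # EBS disk size entered in the UI for the base worker type

for instance_type, memory in memory_gib.items():
    size = proportional_ebs_size_gb(base_size_gb, memory_gib["m5.xlarge"], memory)
    print(f"{instance_type}: {size:.0f} GB")
```

With a 100 GB base size, an instance with twice the memory of the base type gets 200 GB, and one with four times the memory gets 400 GB.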

Passing Cluster Configuration Properties as Cluster Overrides

ACM-5744: You can now add the Spot request timeout and the maximum price percentage, for Spot nodes as well as for the coordinator and minimum nodes, as cluster configuration overrides. Gradual Rollout | Cluster Restart Required

You can configure these overrides on the Advanced Configuration page of the Clusters configuration UI while creating or updating a cluster on AWS. You can also override these fields through the Cluster API.

For more information, see the documentation.

Enhancements

  • ACM-5785: If a Spot fleet request fails and no instance is allocated, Qubole now logs the reason for the failure, which is visible in the cluster start logs through the Clusters UI.
  • ACM-5799: The Qubole Clusters UI and API no longer accept invalid special characters, such as a semicolon, in AWS EC2 custom tags.
  • ACM-5958: Qubole has improved the loading of the Clusters UI page to resolve slow page loads.

Bug Fixes

  • ACM-5629: Qubole has added a new parameter _exclude_job_stats_ to the jobs API. When you specify this parameter, the API does not fetch job stats for child jobs for a given Hadoop command. As a result, the API is much faster when this parameter is specified. In addition, you may use the API even when the cluster is down if this parameter is specified in a given API call.
  • ACM-5932: Fixed an issue where nodes did not start with the EBS volume mounted due to stale filesystem data.
  • ACM-5939: You can opt for OpenJDK 8 by contacting Qubole Support. When OpenJDK 8 is enabled on your account, OracleJDK 8 is removed from cluster nodes. Via Support | Cluster Restart Required
  • ACM-5941: On-Demand nodes that are shut down are now terminated by default, so that they do not remain in the stopped state while still registered with the cluster. Earlier, such nodes remained part of the cluster even after they transitioned to the stopped state.

For a list of bug fixes between versions R57 and R58, see Changelog for api.qubole.com.