Clone a Cluster

POST /api/v1.3/clusters/(string: id_or_label)/clone

Clone a cluster from an existing one. All the attributes of the source cluster (except the label) are copied over to the new cluster, but you can override any of them when creating the clone.

Note

You can now enable HiveServer2 through a Hadoop 2 request API call with additional settings as described in engine_config for Enabling HiveServer2 on a Hadoop 2 (Hive) Cluster. For details on configuring multi-instance HS2 through REST API, see Choosing Multi-instance as an option for running HiveServer2 on Hadoop (Hive) Clusters.

QDS supports defining account-level default cluster tags through the UI and plans to provide API support shortly. For more information, see Adding Account and User level Default Cluster Tags (AWS).

Required Role

The following users can make this API call:

  • Users who belong to the system-admin group.
  • Users who belong to a group associated with a role that allows cloning a cluster. See Managing Groups and Managing Roles for more information.

Parameters

Note

Parameters marked in bold below are mandatory. Others are optional and use the values from the source cluster. Presto is not currently supported on all Cloud platforms; see QDS Components: Supported Versions and Cloud Platforms.

Parameter Description
label A list of labels that identify the cluster. At least one label must be provided when creating a cluster. You must provide a new label to clone a cluster.
presto_version

It is mandatory and only applicable to a Presto cluster. The supported values are:

  • 0.157 (deprecated version)
  • 0.180 (the stable version)
  • 0.193 (the default and stable version)
  • 0.208 (the latest stable version)
spark_version It is mandatory and only applicable to a Spark cluster. The supported values are: 1.6.2, 2.0.2, 2.1.1, 2.2.0, 2.2.1, 2.3.1, and 2.4.0. For more information, see QDS Components: Supported Versions and Cloud Platforms. Deprecated versions: 1.3.1, 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 2.0.0, and 2.1.0.
zeppelin_interpreter_mode This parameter is only applicable to the Spark cluster. The default mode is legacy. Set it to user mode if you want the user-level cluster-resource management on notebooks. See Using the User Interpreter Mode for Spark Notebooks for more information.
ec2_settings Amazon EC2 Settings. The default values are considered if the settings are not configured.
node_configuration Cluster node instances type and other settings.
hadoop_settings Hadoop cluster settings that also contains the configuration description to enable Spark on the cluster.
security_settings Instance security settings.
presto_settings Presto cluster settings.
spark_settings Spark cluster settings.
datadog_settings Datadog cloud monitoring settings. Qubole supports the Datadog cloud monitoring service on Hadoop 2 (Hive) clusters.
disallow_cluster_termination Prevents auto-termination of the cluster after a prolonged period of disuse. The default value is false.
enable_ganglia_monitoring Enables Ganglia monitoring for the cluster. The default value is false.
node_bootstrap_file A file that is executed on every node of the cluster at boot time. You can use it to customize cluster nodes by setting environment variables, installing required packages, and so on. The default value is node_bootstrap.sh.

ec2_settings

Parameter Description
compute_access_key The EC2 Access Key (Note: This field is not visible to non-admin users.)
compute_secret_key The EC2 Secret Key (Note: This field is not visible to non-admin users.)
aws_region The AWS region to create the cluster in. The default value is us-east-1. Valid values are us-east-1, us-east-2, us-west-1, us-west-2, eu-west-1, eu-west-2, sa-east-1, ap-south-1, ap-southeast-1, ap-northeast-1, ap-northeast-2, and ca-central-1.
aws_preferred_availability_zone The preferred availability zone (AZ) in which the cluster must be created. The default value is Any. However, if the cluster is in a VPC, then you cannot set the AZ.
vpc_id The ID of the Virtual Private Cloud (VPC) in which the cluster is created. In this VPC, the enableDnsHostnames parameter must be set to true.
subnet_id The ID of the subnet in which the cluster is created. The subnet must belong to the above VPC and can be public or private. Qubole supports multiple subnets. Specify multiple subnets in this format: "subnet_id": "subnet-id1, subnet-id2, ...., subnet-idn".
master_elastic_ip It is the Elastic IP address for attaching to the cluster master. For more information, see this documentation.
bastion_node_public_dns Specify the Bastion host public DNS name if private subnet is provided for the cluster in a VPC. Do not specify this value for a public subnet.
bastion_node_port It is the port of the Bastion node. The default value is 22. You can specify a non-default port if you want to access the cluster that is in a VPC with a private subnet.
bastion_node_user It is the Bastion node user, which is ec2-user by default. You can specify a non-default user using this option.
role_instance_profile It is a user-defined IAM Role name that you can use in a dual-IAM role configuration. This Role overrides the account-level IAM Role and only you (and not even Qubole) can access this IAM Role and thus it provides more security. For more information, see Creating Dual IAM Roles for your Account.
use_account_compute_creds Set it to true to use the account’s compute credentials for all clusters of the account. The default value is false. This option is not supported on clusters of an IAM-Role-based account.
instance_tenancy

QDS provides instance tenancy at the cluster level and only in a VPC. The tenancy choices are default and dedicated. With dedicated instance_tenancy, the launched instances do not share physical hardware with any instances outside of the respective AWS account.

Note

A cloned cluster would get the instance_tenancy setting if the parent cluster had it configured.
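
The VPC-related options above can be assembled programmatically. The following is a minimal sketch, not part of the Qubole API: the helper name build_ec2_settings and the sample IDs are hypothetical, and it only illustrates the comma-separated subnet_id format and the Bastion defaults (port 22, user ec2-user) described in the table.

```python
def build_ec2_settings(vpc_id, subnet_ids, bastion_dns=None,
                       bastion_port=22, bastion_user="ec2-user"):
    """Return an ec2_settings fragment for a cluster placed in a VPC.

    Hypothetical helper: only the key names come from the parameter table.
    """
    settings = {
        "vpc_id": vpc_id,
        # Multiple subnets are passed as a single comma-separated string.
        "subnet_id": ", ".join(subnet_ids),
    }
    # Bastion details are only needed when the subnet is private.
    if bastion_dns is not None:
        settings["bastion_node_public_dns"] = bastion_dns
        settings["bastion_node_port"] = bastion_port
        settings["bastion_node_user"] = bastion_user
    return settings


fragment = build_ec2_settings("vpc-1234", ["subnet-aaaa", "subnet-bbbb"])
print(fragment["subnet_id"])
```

The returned dictionary can be placed under the ec2_settings key of the clone request body.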

node_configuration

Parameter Description
master_instance_type The instance type to use for a cluster master node. The default value is m1.large for Hadoop-1, Hadoop-2, and Presto clusters. The default value is m3.xlarge for a Spark cluster.
slave_instance_type The instance type to use for cluster worker nodes. The default value is m1.xlarge for Hadoop-1, Hadoop-2, and Presto clusters. The default value is m3.2xlarge for a Spark cluster.
heterogeneous_instance_config Qubole supports configuring heterogeneous nodes in Hadoop 2 and Spark clusters. It implies that worker nodes can be of different instance types. For more information, see heterogeneous_instance_config and An Overview of Heterogeneous Nodes in Clusters.
initial_nodes The number of nodes to start the cluster with. The default value is 2.
max_nodes The maximum number of nodes up to which the cluster can be autoscaled. The default value is 2.
slave_request_type

The request type for the autoscaled worker instances. The default value is spot. The valid values are ondemand, spot, and spot block. The request type of the master node and the minimum worker nodes depends on whether stable_spot_instance_settings or spot_block_settings are passed. For more information, see Master and Minimum Number of Nodes in a Cluster.

Qubole allows you to set spot block as the slave request type even when the master node type is On-Demand. Configuring Spot Blocks describes how to configure Spot blocks for autoscaling even when the master node type is On-Demand.

Note

The feature to set Spot blocks as autoscaling nodes even when the master node and minimum worker nodes are On-Demand nodes is available for beta access and applies only to Hadoop 2 (Hive) clusters. Create a ticket with Qubole Support to enable it on the account. For more information, see Configuring Spot Blocks.

spot_instance_settings The purchase options for autoscaling worker Spot instances. They do not apply to the minimum number of nodes (initial_nodes).
stable_spot_instance_settings Purchases both master node(s) and worker node(s) as Spot Instances only. The bid price is given using the stable_spot_instance_settings. The master node and minimum worker node request type depends on whether or not the stable_spot_instance_settings are passed. For more information, see Master and Minimum Number of Nodes in a Cluster.
spot_block_settings Spot Blocks are Spot instances that run continuously for a finite duration (1 to 6 hours). They are 30 to 45 percent cheaper than On-Demand instances based on the requested duration. For more information, see spot_block_settings. QDS ensures that Spot blocks are acquired at a price lower than On-Demand nodes. It also ensures that autoscaled nodes are acquired for the remaining duration of the cluster. For example, if the duration of a Spot block cluster is 5 hours and there is a need to autoscale at the 2nd hour, Spot blocks are acquired for 3 hours.
fallback_to_ondemand Fallback to on-demand nodes if spot nodes could not be obtained when adding nodes during autoscaling. It is valid only if worker request type is spot. The default value is false if slave_request_type is spot. Qubole also falls back to On-Demand nodes when master-and-minimum-number-of-nodes’ cluster composition is spot nodes.
ebs_volume_type

The default EBS volume type is standard (magnetic). The other possible values are ssd (gp2, General Purpose SSD), st1 (Throughput Optimized HDD), and sc1 (Cold HDD). For more information, see this blog. EBS volumes are attached to increase storage on instance types that come with low storage but have good CPU and memory configuration.

Note

For recommendations on using EBS volumes, see AWS EBS Volumes.

ebs_volume_size

The default EBS volume size is 100 GB for Magnetic/General Purpose SSD volume types and 500 GB for Throughput Optimized HDD/Cold HDD volume type. The supported value range is 100 GB/500 GB to 16 TB. The minimum and maximum volume size varies for each EBS volume type and are mentioned below:

  • For standard (magnetic) EBS volume type, the supported value range is 100 GB to 1 TB.
  • For ssd/gp2 (General Purpose SSD) EBS volume type, the supported value range is 100 GB to 16 TB.
  • For st1 (Throughput Optimized HDD) and sc1 (Cold HDD), the supported value range is 500 GB to 16 TB.

Note

For recommendations on using EBS volumes, see AWS EBS Volumes.

ebs_volume_count The number of EBS volumes to attach to each cluster instance. The default value is 0.
ebs_upscaling_config

Hadoop 2 and Spark clusters that use EBS volumes can dynamically expand their storage capacity. This relies on Logical Volume Management. When enabled, a logical volume is created on a volume group. When the logical volume approaches full capacity, additional EBS volumes are attached to the instance and added to the logical volume, and the file system is resized to accommodate the additional capacity. This is not enabled by default. Storage-capacity upscaling in Hadoop 2 and Spark clusters using EBS volumes also supports upscaling based on the rate of increase of used capacity.

Note

For the required EC2 permissions, see Sample Policy for EBS Upscaling.

Here is an ebs_upscaling_config example.

"node_configuration": {
  "ebs_upscaling_config": {
    "max_ebs_volume_count": 5,
    "percent_free_space_threshold": 20.0,
    "absolute_free_space_threshold": 100,
    "sampling_interval": 40,
    "sampling_window": 8
  }
}

See ebs_upscaling_config for information on the configuration options.

custom_ec2_tags

It is an optional parameter. It is a set of tags to be applied to the AWS instances created for the cluster and to the EBS volumes attached to these instances. Specify it as a JSON object of tag and value pairs, for example, {"project": "webportal", "owner": "john@example.com"}. Set a custom EC2 tag if you want the instances of a cluster to get that tag on AWS. The custom tags are also applied to the Qubole-created security groups (if any).

Tags and values must consist of alphanumeric characters and can contain only these special characters: + (plus sign), . (period), - (hyphen), @ (at sign), = (equal sign), / (forward slash), : (colon), and _ (underscore). The tags Qubole and alias are reserved for use by Qubole (see Qubole Cluster EC2 Tags (AWS)). Tags beginning with aws- are reserved for use by Amazon.

Qubole supports defining user-level EC2 tags. For more information, see Adding Account and User level Default Cluster Tags (AWS).

idle_cluster_timeout The default cluster timeout is 2 hours. Optionally, you can set it to a value from 0 to 6 hours; the only supported unit of time is the hour. If the timeout is set at the account level, it applies to all clusters within that account, but you can override it at the cluster level. The timeout takes effect after all queries on the cluster have completed. Qubole terminates a cluster at an hour boundary. For example, when idle_cluster_timeout is 0 and a node in the cluster is near its hour boundary (that is, it has been running for 50-60 minutes and is idle after all queries have executed), Qubole terminates that cluster.
idle_cluster_timeout_in_secs

After enabling the aggressive downscaling feature on the QDS account, the Cluster Idle Timeout can be configured in seconds. Its minimum configurable value is 300 seconds and the default value would still remain 2 hours (that is 120 minutes or 7200 seconds).

Note

This feature is only available on a request. Contact the account team to enable this feature on the QDS account.

node_base_cooldown_period

With the aggressive downscaling feature enabled on the QDS account, it is the cool down period set in minutes for On-Demand nodes on a Hadoop 2 or a Spark cluster. The default value is 10 minutes. For more information, see Understanding Aggressive Downscaling in Clusters (AWS).

Note

This feature is only available on a request. Contact the account team to enable this feature on the QDS account. You must not set the Cool Down Period to a value lower than 5 minutes. If you set it very low, the node still does not get terminated unless it is decommissioned from HDFS.

With the aggressive downscaling feature enabled on the QDS account, it is the cool down period set in minutes for cluster nodes on a Presto cluster. The default value is 5 minutes. For more information, see Understanding Aggressive Downscaling in Clusters (AWS).

Note

This feature is only available on a request. Contact the account team to enable this feature on the QDS account.

node_spot_cooldown_period

With the aggressive downscaling feature enabled on the QDS account, it is the cool down period set in minutes for Spot nodes on a Hadoop 2 or a Spark cluster. The default value is 15 minutes. For more information, see Understanding Aggressive Downscaling in Clusters (AWS). It is not applicable to Presto clusters as node_base_cooldown_period is used for both On-Demand and Spot nodes in case of a Presto cluster (as described above).

Note

This feature is only available on a request. Contact the account team to enable this feature on the QDS account. You must not set the Cool Down Period to a value lower than 5 minutes. If you set it very low, the node still does not get terminated unless it is decommissioned from HDFS.

root_volume_size Use this parameter to configure the root volume of cluster instances. The supported range for the root volume size is 60 - 2047.
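
The custom_ec2_tags rules in the table above (allowed characters, names reserved by Qubole, and the aws- prefix reserved by Amazon) can be checked on the client before sending a request. This is a minimal sketch under stated assumptions: the function name check_custom_ec2_tags is hypothetical, reserved names are compared case-sensitively, and all tags and values are assumed to be strings. The API itself may apply additional checks.

```python
import re

# Letters, digits, and the special characters listed as allowed: + . - @ = / : _
_ALLOWED = re.compile(r"[A-Za-z0-9+.\-@=/:_]+")
# Tag names the text lists as reserved for use by Qubole.
_RESERVED_BY_QUBOLE = {"Qubole", "alias"}


def check_custom_ec2_tags(tags):
    """Return a list of problems found in a custom_ec2_tags mapping."""
    problems = []
    for key, value in tags.items():
        if key in _RESERVED_BY_QUBOLE:
            problems.append(f"tag {key!r} is reserved for use by Qubole")
        if key.startswith("aws-"):
            problems.append(f"tag {key!r} uses the aws- prefix reserved by Amazon")
        for text in (key, value):
            if not _ALLOWED.fullmatch(text):
                problems.append(f"{text!r} contains characters that are not allowed")
    return problems


print(check_custom_ec2_tags({"project": "webportal", "owner": "john@example.com"}))
```

An empty list means the mapping passed these client-side checks.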

Master and Minimum Number of Nodes in a Cluster

To add the Master and Minimum Number of Nodes in a cluster, you can use Stable Spot Instance, Spot Blocks, or On-Demand nodes. You can set the cluster composition by using one of these configuration types:

  • OnDemand: It is the default value. This applies to On-Demand nodes.
  • stable_spot_instance_settings. This applies to Spot Instances. For example, stable_spot_instance_settings: {maximum_bid_price_percentage: "", timeout_for_request: ""}.
  • spot_block_settings. This applies to Spot Blocks. For example, spot_block_settings: {duration: ""}.

Cluster Composition Settings (AWS) describes how to configure the Master and Minimum Number of Nodes through the Clusters UI page.
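
As an illustration of the three configuration types listed above, the sketch below returns the node_configuration fragment for each one. The function name composition_fragment and the kind labels ("ondemand", "stable_spot", "spot_block") are hypothetical; only the setting names come from this page, and the sample values are placeholders.

```python
def composition_fragment(kind, **options):
    """Return the node_configuration keys for a master/minimum-node composition.

    Hypothetical helper: 'kind' labels are local to this sketch.
    """
    if kind == "ondemand":
        # On-Demand is the default, so no extra settings are passed.
        return {}
    if kind == "stable_spot":
        return {
            "stable_spot_instance_settings": {
                "maximum_bid_price_percentage": options["maximum_bid_price_percentage"],
                "timeout_for_request": options["timeout_for_request"],
            }
        }
    if kind == "spot_block":
        return {"spot_block_settings": {"duration": options["duration"]}}
    raise ValueError(f"unknown composition kind: {kind!r}")


node_configuration = {"initial_nodes": 2, "max_nodes": 10}
node_configuration.update(composition_fragment("spot_block", duration=120))
print(node_configuration)
```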

heterogeneous_instance_config

See An Overview of Heterogeneous Nodes in Clusters for more information.

Parameter Description
memory

To configure a heterogeneous cluster, you must provide a whitelisted list of instance types as shown in the following example.

"node_configuration": {
   "heterogeneous_instance_config": {
       "memory": [
           {"instance_type": "m4.4xlarge", "weight": 1.0},
           {"instance_type": "m4.2xlarge", "weight": 0.5},
           {"instance_type": "m4.xlarge", "weight": 0.25}
       ]
   }
}

The following points about the instance types apply to a heterogeneous cluster:

  • The whitelisted instance types are specified in an array with weights based on the available memory (Qubole plans to provide weighing on resources such as CPU in the future).

  • The first instance type must be the same as the cluster’s slave_instance_type and have a weight of 1.0. This is the primary instance type. If you use Qubole’s APIs to create a heterogeneous cluster, ensure that the first instance type is the primary instance type.

    Specify instance weights as floating-point numbers only, such as 1.0 and 2.0.

  • Qubole would try the rest of the instance types whenever it needs to provision nodes and when nodes from the earlier list are unavailable. The number of instances requested is decided by the weight. For example, during autoscaling, Qubole decides that it needs 10 m4.4xlarge nodes. But if this instance type is unavailable, Qubole tries to get 20 m4.2xlarge nodes instead. It is only valid for On-Demand nodes.

    However, with spot instances, Qubole uses AWS spot fleet, so, Qubole would get the cheapest combination of nodes of different types that satisfies the target capacity.
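
The weight arithmetic in the On-Demand example above (10 m4.4xlarge nodes becoming 20 m4.2xlarge nodes) can be sketched as follows. This is an illustration, not Qubole's implementation: the function names are hypothetical, the memory setting is assumed to be a list of instance_type/weight entries, and rounding up for weights that do not divide evenly is an assumption.

```python
import math


def fallback_node_count(primary_count, weight):
    """Nodes of a fallback type needed to match primary_count primary nodes."""
    if weight <= 0:
        raise ValueError("weight must be positive")
    # Rounding up is an assumption of this sketch.
    return math.ceil(primary_count / weight)


def check_memory_config(memory, slave_instance_type):
    """Check that the first entry is the primary type with weight 1.0."""
    first = memory[0]
    return (first["instance_type"] == slave_instance_type
            and float(first["weight"]) == 1.0)


memory = [
    {"instance_type": "m4.4xlarge", "weight": 1.0},
    {"instance_type": "m4.2xlarge", "weight": 0.5},
    {"instance_type": "m4.xlarge", "weight": 0.25},
]
print(check_memory_config(memory, "m4.4xlarge"))
print(fallback_node_count(10, memory[1]["weight"]))
```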

ebs_upscaling_config

Note

For the required EC2 permissions, see Sample Policy for EBS Upscaling.

Parameter Description
max_ebs_volume_count The maximum number of EBS volumes that can be attached to an instance. It must be more than ebs_volume_count for upscaling to work.
percent_free_space_threshold The percentage of free space on the logical volume as a whole at which addition of disks must be attempted. The default value is 25%, which means new disks are added when the EBS volume is (greater than or equal to) 75% full.
absolute_free_space_threshold The absolute free capacity of the EBS volume above which upscaling does not occur. With a percentage threshold, the amount of free space that triggers upscaling changes as the size of the logical volume increases. For example, if you start with a threshold of 15% and a single disk of 100 GB, the disk is upscaled when it has less than 15 GB of free capacity. When a new disk is added, the total capacity of the logical volume becomes 200 GB and it is upscaled when the free capacity falls below 30 GB. If you prefer to upscale only when the free capacity is below a fixed value, use absolute_free_space_threshold. The default value is 100, which means that if the logical volume has at least 100 GB of free capacity, Qubole does not add more EBS volumes.
sampling_interval It is the frequency at which the capacity of the logical volume is sampled. Its default value is 30 seconds.
sampling_window

It is the number of sampling_intervals over which Qubole evaluates the rate of increase of used capacity. Its default value is 5. This means that the rate is evaluated over 150 (30 * 5) seconds by default. To disable rate-based upscaling and use only thresholds, set this value to 1. When sampling_window is set to 1, absolute_free_space_threshold is monitored at every sampling_interval.

The logical volume is upscaled if, based on the current rate, it is estimated to become full within (sampling_interval + 600) seconds. The additional 600 seconds is because adding a new EBS volume to a heavily loaded volume group has been observed to take up to 600 seconds. Here is an example of how the free space threshold decreases with respect to the sampling window and sampling interval. Assuming the default values of sampling_interval (30 seconds) and sampling_window (5), this is how the free space threshold decreases:

  • 0th second: 100% free space
  • 30th second: 95.3% free space
  • 60th second: 90.6% free space
  • 90th second: 85.9% free space
  • 120th second: 81.2% free space
  • 150th second: 76.5% free space
  • 180th second: 71.8% free space
  • 210th second: 67.1% free space
  • 600th second: 6% free space
  • 630th second: 1.3% free space
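
The rate-based rule described above can be sketched as follows. This is an approximation for illustration, not Qubole's implementation: the function names are hypothetical, the rate is assumed to be linear over the samples, and a window of 5 intervals is assumed to mean 6 samples taken 30 seconds apart.

```python
def seconds_until_full(free_pct_samples, sampling_interval=30):
    """Estimate seconds until free space reaches 0%, assuming a linear rate.

    free_pct_samples holds free-space percentages taken every
    sampling_interval seconds, oldest first. Returns None if usage is not growing.
    """
    if len(free_pct_samples) < 2:
        return None
    elapsed = sampling_interval * (len(free_pct_samples) - 1)
    consumed = free_pct_samples[0] - free_pct_samples[-1]
    if consumed <= 0:
        return None
    rate = consumed / elapsed  # percentage points per second
    return free_pct_samples[-1] / rate


def should_upscale(free_pct_samples, sampling_interval=30, attach_delay=600):
    """True if the volume is estimated to fill within interval + attach_delay."""
    eta = seconds_until_full(free_pct_samples, sampling_interval)
    return eta is not None and eta <= sampling_interval + attach_delay


# Free space from the 0th to the 150th second in the list above.
samples = [100, 95.3, 90.6, 85.9, 81.2, 76.5]
print(should_upscale(samples))
```

With these samples the volume loses 23.5 percentage points in 150 seconds, so the remaining 76.5% is estimated to last less than 630 seconds and the sketch reports that upscaling is due.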

spot_instance_settings

Parameter Description
timeout_for_request The timeout for a Spot Instance request in minutes. The default value is 1 for new clusters. Qubole recommends using the default value of 1 minute for existing clusters.
maximum_spot_instance_percentage The maximum percentage of instances that may be purchased from the AWS Spot market. The default value is 50.

stable_spot_instance_settings

Use this parameter to set master and minimum number of nodes in a cluster. For more information, see Master and Minimum Number of Nodes in a Cluster.

Parameter Description
timeout_for_request The timeout for a Spot Instance request in minutes. The default value is 1 for new clusters. Qubole recommends using the lower value of 1 minute for existing clusters.

spot_block_settings

Use this parameter to configure Spot Blocks for the master node and minimum number of nodes, as described in Master and Minimum Number of Nodes in a Cluster, and for worker nodes.

Parameter Description
duration

Set the duration in minutes. The accepted value range is 60-360 minutes, and the duration must be a multiple of 60. It is set in node_configuration. Spot blocks are more stable than Spot nodes because they are not susceptible to being taken away during the specified duration. However, these nodes are always terminated once the duration for which they were requested is completed. For more details, see AWS spot blocks. An example of a Spot block configuration is given below.

"node_configuration": {"spot_block_settings": {"duration":120} }

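The duration constraints stated above (60 to 360 minutes, in multiples of 60) are easy to check before building the request. The helper below is a minimal sketch with a hypothetical name, spot_block_fragment; the API may validate the value differently.

```python
def spot_block_fragment(duration_minutes):
    """Return a node_configuration fragment after checking the duration rules."""
    if not 60 <= duration_minutes <= 360:
        raise ValueError("duration must be between 60 and 360 minutes")
    if duration_minutes % 60 != 0:
        raise ValueError("duration must be a multiple of 60")
    return {"spot_block_settings": {"duration": duration_minutes}}


print(spot_block_fragment(120))
```
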
hadoop_settings

Parameter Description
use_hadoop2 Set this parameter value to true to start Hadoop 2 daemons on a cluster. It is a mandatory setting for a Hadoop 2 cluster.
use_spark This is a mandatory setting for a Spark cluster. Its value must be true to start Spark daemons on the cluster.
custom_config The custom Hadoop configuration overrides. The default value is blank.
fairscheduler_settings The fair scheduler configuration options.
use_qubole_placement_policy Use Qubole Block Placement policy for clusters with spot nodes.

fairscheduler_settings

Parameter Description
fairscheduler_config_xml XML string with custom configuration parameters for the fair scheduler. The default value is blank.
default_pool The default Fair Scheduler queue, used when no queue is specified during job submission.

security_settings

It is now possible to enhance security of a cluster by authorizing Qubole to generate a unique SSH key every time a cluster is started. This feature is not enabled by default. Create a ticket with Qubole Support to enable this feature. Once this feature is enabled, Qubole starts using the unique SSH key to interact with the cluster. For clusters running in private subnets, enabling this feature generates a unique SSH key for the Qubole account. This SSH key must be authorized on the Bastion host.

Parameter Description
encrypted_ephemerals Qubole allows encrypting ephemeral drives on the instances. Create a ticket with Qubole Support to enable the block device encryption.
ssh_public_key SSH key to use to log in to the instances. The default value is none. (Note: This parameter is not visible to non-admin users.) The SSH key must be in the OpenSSH format and not in the PEM/PKCS format.
persistent_security_group This option overrides the account-level security group settings. By default, this option is not set but inherits the account-level persistent security group, if any. Use this option if you want to give additional access permissions to cluster nodes. Qubole only uses the security group name for validation, so do not provide the security group’s ID. You must provide a persistent security group when you configure outbound communication from cluster nodes to pass through an Internet proxy server.

presto_settings

Parameter Description
enable_presto Enable Presto on the cluster.
custom_config Specify the custom Presto configuration overrides. The default value is blank.

spark_settings

Parameter Description
custom_config Specify the custom Spark configuration overrides. The default value is blank.

datadog_settings

Note

This feature is enabled on Hadoop 2 (Hive), Presto, and Spark clusters. Once you set the Datadog settings, Ganglia monitoring gets automatically enabled. Although the Ganglia monitoring is enabled, its link may not be visible in the cluster’s UI resources list.

Parameter Description
datadog_api_token Specify the Datadog API token to use the Datadog monitoring service. The default value is NULL.
datadog_app_token Specify the Datadog APP token to use the Datadog monitoring service. The default value is NULL.

Response

The response contains a JSON object representing the new cluster. All the attributes mentioned here are returned (except when otherwise specified or redundant).

Examples

Goal

Clone the cluster with ID 116.

curl -X POST -H "X-AUTH-TOKEN:$X_AUTH_TOKEN" -H "Content-Type:application/json" -H "Accept: application/json" \
-d '{
       "label": ["clone_116"],
       "node_configuration": {
         "initial_nodes": 2,
         "slave_request_type": "ondemand",
         "slave_instance_type": "c3.xlarge",
         "max_nodes": 10,
         "master_instance_type": "c3.large"
       },
       "enable_ganglia_monitoring": true
    }' \
https://api.qubole.com/api/v1.3/clusters/116/clone

Note

The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
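
The same request can be prepared from a script. The following is a minimal sketch using only the Python standard library; the function name build_clone_request is hypothetical, and the request is built but not sent so that the example runs without credentials or network access.

```python
import json
import urllib.request


def build_clone_request(cluster, auth_token, overrides,
                        endpoint="https://api.qubole.com"):
    """Build (but do not send) a POST request that clones a cluster.

    'cluster' is the ID or label of the source cluster; 'overrides' holds the
    attributes to change in the clone and must include a new label.
    """
    url = f"{endpoint}/api/v1.3/clusters/{cluster}/clone"
    body = json.dumps(overrides).encode("utf-8")
    headers = {
        "X-AUTH-TOKEN": auth_token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_clone_request(
    116,
    "<your_auth_token>",
    {
        "label": ["clone_116"],
        "node_configuration": {"initial_nodes": 2, "max_nodes": 10},
        "enable_ganglia_monitoring": True,
    },
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the JSON body of the response.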

Response

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{
   "security_settings":{
     "encrypted_ephemerals":false
   },
   "enable_ganglia_monitoring":true,
   "label":[
     "clone_116"
   ],
   "ec2_settings":{
     "compute_validated":false,
     "compute_secret_key":"<your_ec2_compute_secret_key>",
     "aws_region":"us-west-2",
     "vpc_id":null,
     "aws_preferred_availability_zone":"Any",
     "compute_access_key":"<your_ec2_compute_access_key>",
     "subnet_id":null
   },
   "node_bootstrap_file":"node_bootstrap.sh",
   "hadoop_settings":{
   "use_hadoop2":false,
    "custom_config":null,
    "fairscheduler_settings":{
      "default_pool":null
    }
   },
   "disallow_cluster_termination":false,
   "presto_settings":{
   "enable_presto":false,
    "custom_config":null
   },
  "id":116,
  "state":"DOWN",
  "node_configuration":{
   "max_nodes":10,
   "master_instance_type":"c3.large",
   "slave_instance_type":"c3.xlarge",
   "use_stable_spot_nodes":false,
   "slave_request_type":"ondemand",
   "initial_nodes":2,
   "spot_instance_settings":{
     "maximum_bid_price_percentage":"100.0",
     "timeout_for_request":10,
     "maximum_spot_instance_percentage":60
    }
 }
}