Create a Cluster on Oracle OCI¶
POST /api/v2/clusters/¶
Use this API to create a new cluster when you are using Qubole on Oracle OCI. You create a cluster for a workload that has to run in parallel with your pre-existing workloads.
Required Role¶
The following users can make this API call:
- Users who belong to the system-user or system-admin group.
- Users who belong to a group associated with a role that allows creating a cluster. See Managing Groups and Managing Roles for more information.
Parameters¶
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
Parameter | Description |
---|---|
cloud_config | It contains the cloud configuration of the cluster: the compute, location, network, and storage settings. |
cluster_info | It contains the configurations of a cluster. |
engine_config | It contains the configuration of the cluster type (engine). |
monitoring | It contains the cluster monitoring configuration. |
security_settings | It contains the security settings for the cluster. |
cloud_config¶
Parameter | Description |
---|---|
provider | It defines the cloud provider. Set it to oracle when the cluster is created on QDS on Oracle OCI. |
compute_config | It defines the OCI account compute credentials for the cluster. |
location | It is used to set the geographical OCI location. |
network_config | It defines the network configuration for the cluster. |
storage_config | It defines the OCI storage configuration for the cluster. |
compute_config¶
Parameter | Description |
---|---|
compute_validated | It denotes if the credentials are validated or not. |
use_account_compute_creds | It specifies whether to use the account compute credentials. By default, it is set to false. Set it to true to use the account compute credentials; in that case, the credential settings that follow are not required. |
compute_tenant_id | The Tenancy OCID of the account in which you created the user in step 1 above. The Tenancy OCID appears at the bottom of the screen in the OCI Console. It is required when use_account_compute_creds is set to false. |
compute_user_id | The user ID of the owner of the QDS account on OCI. It is required when use_account_compute_creds is set to false. |
compute_key_finger_print | The fingerprint Oracle provided when you uploaded the public key in step 2 of Configuring Oracle OCI Resources. It is required when use_account_compute_creds is set to false. |
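The dependency between use_account_compute_creds and the credential fields can be checked on the client side before a request is sent. The following is a minimal sketch of such a check; build_compute_config is a hypothetical helper written for illustration and is not part of any Qubole SDK.

```python
def build_compute_config(use_account_creds=False, tenant_id=None,
                         user_id=None, key_fingerprint=None):
    """Return a compute_config dict (hypothetical helper, for illustration only)."""
    if use_account_creds:
        # Account-level credentials are used, so no per-cluster credentials are sent.
        return {"use_account_compute_creds": True}

    # With use_account_compute_creds set to false, the table above marks
    # these fields as required.
    required = {
        "compute_tenant_id": tenant_id,
        "compute_user_id": user_id,
        "compute_key_finger_print": key_fingerprint,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError("missing compute credentials: " + ", ".join(missing))
    return {"use_account_compute_creds": False, **required}
```

Calling build_compute_config(use_account_creds=True) yields a dict with only the flag set, while calling it with the flag left at false and no credentials raises a ValueError that names the missing fields.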
network_config¶
Parameter | Description |
---|---|
compartment_id | It is the ID of the compartment where all your QDS instances and images will be stored and brought up, and (by default) where query output and logs will be stored as well. Compartments allow you to organize and control access to your cloud resources. A compartment is a collection of related resources (such as instances, virtual cloud networks, block volumes) that can be accessed only by certain groups that have been given permission by an administrator. Configuring Oracle OCI Resources provides more information. |
vcn_id | Set the ID of the Virtual Cloud Network if the cluster would be part of it. A Virtual Cloud Network is a virtual version of a traditional network including subnets, route tables, and gateways on which your instances run. A cloud network resides within a single region but can cross multiple Availability Domains. |
subnet_id | Set the subnet ID associated with the VCN. You can define subnets for a cloud network in different Availability Domains, but the subnet itself must belong to a single Availability Domain. |
bastion_node | It is the public IP address of the bastion node used to access private subnets, if required. |
storage_config¶
Parameter | Description |
---|---|
block_volume_count | It is the number of block volumes to be mounted to an instance as reserved disks. Reserved disks provide additional storage for HDFS and intermediate data on instances with low storage density. Its default value is 0 and it can be greater than 0. |
block_volume_size | It is the size (in GB) of each block volume to be mounted to an instance as a reserved disk. Its default value is 256 GB and the supported value range is 50 to 2048. |
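The limits in the table above lend themselves to a simple client-side check. The sketch below assumes a hypothetical helper named build_storage_config; only the field names, defaults, and ranges come from the table.

```python
def build_storage_config(block_volume_count=0, block_volume_size=256):
    """Return a storage_config dict after range checks (hypothetical helper)."""
    if block_volume_count < 0:
        raise ValueError("block_volume_count must be 0 or greater")
    # Supported size range from the table above: 50 to 2048 (GB).
    if not 50 <= block_volume_size <= 2048:
        raise ValueError("block_volume_size must be between 50 and 2048 GB")
    return {
        "block_volume_count": block_volume_count,
        "block_volume_size": block_volume_size,
    }


storage_config = build_storage_config(block_volume_count=2, block_volume_size=500)
```

Validating locally is only a convenience here; the API remains the authority on what it accepts.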
location¶
Parameter | Description |
---|---|
region | It is the region in which nodes are launched. The default and the only supported region is us-phoenix-1. |
availability_domain | The Oracle availability domain in which nodes are launched. The supported values are: ncSu:PHX-AD1, ncSu:PHX-AD2, and ncSu:PHX-AD3. |
cluster_info¶
Parameter | Description |
---|---|
label | A cluster can have one or more labels separated by commas. You can make a cluster the default cluster by including the label “default”. |
master_instance_type | The instance type of the coordinator node. Set it to change the coordinator node type from the default (Standard_A5). |
slave_instance_type | The instance type of the worker nodes. Set it to change the worker node type from the default (Standard_A5). |
min_nodes | The minimum number of worker nodes. The default is 1. |
max_nodes | The maximum number of worker nodes. The default is 1. |
node_bootstrap | You can append the name of a node bootstrap script to the default path. |
disallow_cluster_termination | Set it to true if you do not want QDS to terminate idle clusters automatically. Qubole recommends that you set this parameter to false. |
custom_tags | It is an optional parameter. Its value contains a <tag> and a <value>. |
rootdisk | Use this parameter to configure the root volume of cluster instances. You must configure its size within this parameter. The supported range for the root volume size is 90 - 2047. An example usage would be "rootdisk": {"size": 500}. |
engine_config¶
Parameter | Description |
---|---|
flavour | Denotes the type of cluster. The supported values for OCI are hadoop2 and spark. |
hadoop_settings | It contains the Hadoop configuration of the cluster. See hadoop_settings. |
spark_settings | It contains the Spark configuration of the cluster. See spark_settings. |
hadoop_settings¶
Parameter | Description |
---|---|
custom_hadoop_config | The custom Hadoop configuration overrides. The default value is blank. |
fairscheduler_settings | The fair scheduler configuration options. |
fairscheduler_settings¶
Parameter | Description |
---|---|
fairscheduler_config_xml | The XML string, with custom configuration parameters, for the fair scheduler. The default value is blank. |
default_pool | The default pool for the fair scheduler. The default value is blank. |
spark_settings¶
Parameter | Description |
---|---|
zeppelin_interpreter_mode | The default mode is legacy. Set it to user mode if you want user-level cluster-resource management on notebooks. See Configuring a Spark Notebook for more information. |
custom_spark_config | Specify the custom Spark configuration overrides. The default value is blank. |
spark_version | It is the Spark version used on the cluster. The default version is 2.0-latest. The other supported version is 2.1-latest. |
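The engine_config object nests hadoop_settings and spark_settings under a flavour. The sketch below shows one way to assemble it; build_engine_config is a hypothetical helper. Rejecting Spark options on a non-Spark cluster is an assumption made for this example, not documented API behavior.

```python
SUPPORTED_FLAVOURS = {"hadoop2", "spark"}
SUPPORTED_SPARK_VERSIONS = {"2.0-latest", "2.1-latest"}


def build_engine_config(flavour, custom_hadoop_config=None, default_pool=None,
                        spark_version=None, custom_spark_config=None):
    """Return an engine_config dict (hypothetical helper, for illustration only)."""
    if flavour not in SUPPORTED_FLAVOURS:
        raise ValueError("unsupported flavour: " + repr(flavour))
    # Assumption: Spark options only make sense when flavour is "spark".
    if flavour != "spark" and (spark_version or custom_spark_config):
        raise ValueError("Spark settings require flavour 'spark'")
    if spark_version is not None and spark_version not in SUPPORTED_SPARK_VERSIONS:
        raise ValueError("unsupported spark_version: " + repr(spark_version))

    config = {"flavour": flavour}

    hadoop_settings = {}
    if custom_hadoop_config:
        hadoop_settings["custom_hadoop_config"] = custom_hadoop_config
    if default_pool:
        hadoop_settings["fairscheduler_settings"] = {"default_pool": default_pool}
    if hadoop_settings:
        config["hadoop_settings"] = hadoop_settings

    spark_settings = {}
    if spark_version:
        spark_settings["spark_version"] = spark_version
    if custom_spark_config:
        spark_settings["custom_spark_config"] = custom_spark_config
    if spark_settings:
        config["spark_settings"] = spark_settings

    return config
```

Optional settings that are not supplied are left out of the result, so the API defaults described in the tables above apply.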
monitoring¶
Parameter | Description |
---|---|
enable_ganglia_monitoring | Enable Ganglia monitoring for the cluster. The default value is false. |
security_settings¶
Parameter | Description |
---|---|
ssh_public_key | The SSH key used to log in to the instances. The default value is none. (Note: This parameter is not visible to non-admin users.) The SSH key must be in the OpenSSH format and not in the PEM/PKCS format. |
airflow_settings¶
The following table contains engine_config for an Airflow cluster.
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
Parameter | Description |
---|---|
dbtap_id | ID of the data store inside QDS. Set it to -1 if you are using the local MySQL instance as the data store. |
fernet_key | Encryption key for sensitive information inside the Airflow database, for example, user passwords and connections. It must be 32 URL-safe base64-encoded bytes. |
type | Engine type. It is airflow for an Airflow cluster. |
version | The default version is 1.10.0 (stable version). The other supported stable versions are 1.8.2 and 1.10.2. All the Airflow versions are compatible with MySQL 5.6 or higher. |
airflow_python_version | Supported versions are 3.5 (supported using package management) and 2.7. To know more, see Configuring an Airflow Cluster. |
overrides | Airflow configuration to override the default settings. Use the following syntax for overrides:
|
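A value that meets the fernet_key requirement above (32 bytes, URL-safe base64 encoded) can be produced with the Python standard library alone. This is a minimal sketch; generate_fernet_key is a hypothetical name, and the key should be stored securely once generated.

```python
import base64
import os


def generate_fernet_key():
    """Return 32 random bytes encoded as URL-safe base64 text."""
    # os.urandom draws from the operating system's secure random source.
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


fernet_key = generate_fernet_key()
```

The resulting string is 44 characters long, because 32 bytes encode to 44 base64 characters including padding.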
Request API Syntax¶
If use_account_compute_creds is set to true, then it is not required to set compute credentials.
curl -X POST -H "X-AUTH-TOKEN:$X_AUTH_TOKEN" -H "Content-Type:application/json" -H "Accept: application/json" \
-d '
{
  "cloud_config": {
    "compute_config": {
      "use_account_compute_creds": <false/true>,
      "compute_tenant_id": "<tenantID>",
      "compute_user_id": "<user ID>",
      "compute_key_finger_print": "<key finger print>"
    },
    "network_config": {
      "compartment_id": "<compartment ID>",
      "subnet_id": "<subnet-id>",
      "vcn_id": "<vcn-id>",
      "Image_id": ""
    },
    "location": {
      "region": "us-phoenix-1",
      "availability_domain": "<availability domain>"
    }
  },
  "cluster_info": {
    "master_instance_type": "<master instance type>",
    "slave_instance_type": "<slave instance type>",
    "label": [ "<label>" ],
    "min_nodes": 1,
    "max_nodes": <maximum nodes>,
    "node_bootstrap": "node_bootstrap.sh",
    "disallow_cluster_termination": <false/true>
  },
  "engine_config": {
    "flavour": "<hadoop2/spark>",
    "hadoop_settings": {
      "custom_hadoop_config": "<hadoop override>",
      "fairscheduler_settings": {
        "default_pool": "<default pool>"
      }
    }
  }
}' \
"https://oraclecloud.qubole.com/api/v2/clusters"
Sample API Request¶
curl -X POST -H "X-AUTH-TOKEN:$X_AUTH_TOKEN" -H "Content-Type:application/json" -H "Accept: application/json" \
-d '
{
  "cloud_config": {
    "compute_config": {
      "use_account_compute_creds": false,
      "compute_tenant_id": "xxx11",
      "compute_user_id": "yyyy11",
      "compute_key_finger_print": "zzz22",
      "compute_api_private_rsa_key": "aaa"
    },
    "network_config": {
      "compartment_id": "abc-compartment",
      "subnet_id": "subnet-1",
      "vcn_id": "vcn-1",
      "Image_id": ""
    },
    "location": {
      "region": "us-phoenix-1",
      "availability_domain": "phx-ad-1"
    }
  },
  "cluster_info": {
    "master_instance_type": "<master instance type>",
    "slave_instance_type": "<slave instance type>",
    "label": [ "oraclehadoop2" ],
    "min_nodes": 1,
    "max_nodes": 4,
    "node_bootstrap": "node_bootstrap.sh",
    "disallow_cluster_termination": true
  },
  "engine_config": {
    "flavour": "hadoop2",
    "hadoop_settings": {
      "custom_hadoop_config": "mapred.tasktracker.map.tasks.maximum=3"
    }
  }
}' \
"https://oraclecloud.qubole.com/api/v2/clusters"
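For callers who prefer a script to curl, the same request can be assembled with the Python standard library. The sketch below only builds the request object and does not send it. build_create_cluster_request is a hypothetical helper, the token is a placeholder, and the payload is a shortened variant of the sample that assumes account-level compute credentials are in use.

```python
import json
import urllib.request

API_URL = "https://oraclecloud.qubole.com/api/v2/clusters"


def build_create_cluster_request(payload, auth_token, url=API_URL):
    """Build (but do not send) the POST request (hypothetical helper)."""
    headers = {
        "X-AUTH-TOKEN": auth_token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Shortened payload based on the sample request above; the values are
# illustrative, not real resource identifiers.
payload = {
    "cloud_config": {
        "compute_config": {"use_account_compute_creds": True},
        "network_config": {
            "compartment_id": "abc-compartment",
            "subnet_id": "subnet-1",
            "vcn_id": "vcn-1",
        },
        "location": {"region": "us-phoenix-1", "availability_domain": "phx-ad-1"},
    },
    "cluster_info": {
        "label": ["oraclehadoop2"],
        "min_nodes": 1,
        "max_nodes": 4,
        "node_bootstrap": "node_bootstrap.sh",
        "disallow_cluster_termination": True,
    },
    "engine_config": {
        "flavour": "hadoop2",
        "hadoop_settings": {
            "custom_hadoop_config": "mapred.tasktracker.map.tasks.maximum=3"
        },
    },
}

request = build_create_cluster_request(payload, auth_token="<auth token>")
# To send the request, pass it to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it possible to inspect the method, headers, and JSON body before anything reaches the API.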