Submit a Workflow Command
- POST /api/v1.2/commands/
Use this API to submit a workflow command. A workflow command contains an array of sub-commands that run in sequence as part of a single command.
Required Role
The following users can make this API call:
- Users who belong to the system-user or system-admin group.
- Users who belong to a group associated with a role that allows submitting a command. See Managing Groups and Managing Roles for more information.
Parameters
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
Parameter | Description |
---|---|
sub_commands | Array of sub-commands to run as part of this command. Sub-commands are executed in the order in which they are specified. Options specific to a command type (for example, query for Hive commands or inline for shell commands) can be provided in the individual sub-command definitions. |
command_type | CompositeCommand |
label | Cluster label on which this command is to be run. |
name | Name of the command, which is useful for filtering commands in the command history. It cannot contain the special characters & (ampersand), < (less than), > (greater than), " (double quote), or ' (single quote), and it cannot contain HTML tags. It can contain a maximum of 255 characters. |
tags | Tag that makes the command easy to identify and search for in the commands list in the Commands History. Use a tag as a filter value while searching commands. It can contain a maximum of 255 characters. A comma-separated list of tags can be associated with a single command. When adding a tag value, enclose it in square brackets. For example, |
timeout | Timeout for command execution, in seconds. The default value is 129600 seconds (36 hours). QDS checks the timeout for a command every 60 seconds, so a command with a timeout of 80 seconds is killed at the next check, that is, after 120 seconds. Setting this parameter keeps a command from running for the full 36 hours. |
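The constraints on name and timeout described in the table can be checked on the client before a command is submitted. The following Python sketch is illustrative only: validate_name and effective_kill_seconds are hypothetical helper names, not part of any Qubole SDK. It assumes the forbidden quote characters are the plain ASCII ones, and that the 60-second checks fall on whole multiples of 60 seconds after the command starts, which is how the 80-second example reads.

```python
# Hypothetical client-side helpers; none of these names come from a Qubole SDK.
FORBIDDEN_NAME_CHARS = set("&<>\"'")  # assumption: plain ASCII quote characters
MAX_NAME_LENGTH = 255
CHECK_INTERVAL_SECONDS = 60  # QDS checks the timeout every 60 seconds


def validate_name(name):
    """Raise ValueError if the name breaks the documented constraints."""
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("name is longer than 255 characters")
    found = sorted(FORBIDDEN_NAME_CHARS.intersection(name))
    if found:
        # Rejecting < and > also rules out HTML tags.
        raise ValueError("name contains forbidden characters: " + " ".join(found))


def effective_kill_seconds(timeout, interval=CHECK_INTERVAL_SECONDS):
    """Round the timeout up to the next check.

    Assumption: checks happen at whole multiples of `interval` seconds.
    """
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    return -(-timeout // interval) * interval  # integer ceiling division


validate_name("terasort benchmark")   # passes silently
print(effective_kill_seconds(80))     # 120, matching the 80-second example
```

Under these assumptions, any timeout that is not a multiple of 60 is effectively rounded up to the next minute.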
Examples
Goal: Run the terasort benchmark
curl -X POST -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Content-Type: application/json" \
-d '{
  "sub_commands": [
    {
      "inline": "if hadoop fs -ls /user/hduser/terasort-input; then\n hadoop fs -rmr /user/hduser/terasort-input\nfi\n\nif hadoop fs -ls /user/hduser/terasort-output; then\n hadoop fs -rmr /user/hduser/terasort-output\nfi",
      "command_type": "ShellCommand"
    },
    {
      "inline": "#!/bin/bash\n\nNUM_MAP_TASKS=20\n\n# Total data generated = (DATA_SIZE * 100) bytes\n# Keep DATA_SIZE at 1000000000 to generate 100GB for terasort\nDATA_SIZE=5000000000\n\nhadoop jar /usr/lib/hadoop/hadoop-0.20.1-dev-examples.jar teragen -Dmapred.map.tasks=${NUM_MAP_TASKS} ${DATA_SIZE} /user/hduser/terasort-input",
      "command_type": "ShellCommand"
    },
    {
      "inline": "#!/bin/bash\n\nNUM_REDUCE_TASKS=20\nNUM_MAP_TASKS=20\nhadoop jar /usr/lib/hadoop/hadoop-0.20.1-dev-examples.jar terasort -Dmapred.map.tasks=${NUM_MAP_TASKS} -Dmapred.reduce.tasks=${NUM_REDUCE_TASKS} /user/hduser/terasort-input/ /user/hduser/terasort-output",
      "command_type": "ShellCommand"
    }
  ],
  "command_type": "CompositeCommand"
}' \
"https://api.qubole.com/api/v1.2/commands"
Note
The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
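The same request can be prepared from Python with only the standard library. This is a minimal sketch, not an official client: build_workflow_request is a hypothetical helper, the token is a placeholder, and the two short inline scripts stand in for the full terasort scripts. The request is built but not sent; passing it to urllib.request.urlopen would submit it.

```python
import json
import urllib.request

# Endpoint taken from the curl example; other Qubole endpoints may apply.
API_URL = "https://api.qubole.com/api/v1.2/commands"


def build_workflow_request(auth_token, sub_commands, **optional):
    """Build (but do not send) a POST request for a workflow command.

    `optional` carries the optional parameters from the table,
    for example label, name, tags, or timeout.
    """
    payload = {
        "sub_commands": sub_commands,
        "command_type": "CompositeCommand",
    }
    payload.update(optional)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-AUTH-TOKEN": auth_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Shortened placeholder scripts; sub-commands run in the order listed.
request = build_workflow_request(
    "placeholder-token",
    [
        {"command_type": "ShellCommand",
         "inline": "hadoop fs -ls /user/hduser/terasort-input"},
        {"command_type": "ShellCommand",
         "inline": "hadoop fs -ls /user/hduser/terasort-output"},
    ],
    label="default",
    timeout=3600,
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the cluster.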
Response:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
{
  "timeout": null,
  "template": "generic",
  "resolved_macros": null,
  "status": "waiting",
  "qbol_session_id": 974,
  "progress": 0,
  "qlog": null,
  "can_notify": false,
  "end_time": null,
  "start_time": null,
  "user_id": 10,
  "label": "default",
  "command": {
    "sub_commands": [
      {
        "status": "waiting",
        "command": {
          "parameters": null,
          "archives": null,
          "inline": "if hadoop fs -ls /user/hduser/terasort-input; then\n hadoop fs -rmr /user/hduser/terasort-input\nfi\n\nif hadoop fs -ls /user/hduser/terasort-output; then\n hadoop fs -rmr /user/hduser/terasort-output\nfi",
          "script_location": null,
          "files": null
        },
        "start_time": null,
        "end_time": null,
        "sequence_number": 1,
        "pid": null,
        "id": 50,
        "command_type": "ShellCommand"
      },
      {
        "status": "waiting",
        "command": {
          "parameters": null,
          "archives": null,
          "inline": "#!/bin/bash\n\nNUM_MAP_TASKS=20\n\n# Total data generated = (DATA_SIZE * 100) bytes\n# Keep DATA_SIZE at 1000000000 to generate 100GB for terasort\nDATA_SIZE=5000000000\n\nhadoop jar /usr/lib/hadoop/hadoop-0.20.1-dev-examples.jar teragen -Dmapred.map.tasks=${NUM_MAP_TASKS} ${DATA_SIZE} /user/hduser/terasort-input",
          "script_location": null,
          "files": null
        },
        "start_time": null,
        "end_time": null,
        "sequence_number": 2,
        "pid": null,
        "id": 51,
        "command_type": "ShellCommand"
      },
      {
        "status": "waiting",
        "command": {
          "parameters": null,
          "archives": null,
          "inline": "#!/bin/bash\n\nNUM_REDUCE_TASKS=20\nNUM_MAP_TASKS=20\nhadoop jar /usr/lib/hadoop/hadoop-0.20.1-dev-examples.jar terasort -Dmapred.map.tasks=${NUM_MAP_TASKS} -Dmapred.reduce.tasks=${NUM_REDUCE_TASKS} /user/hduser/terasort-input/ /user/hduser/terasort-output",
          "script_location": null,
          "files": null
        },
        "start_time": null,
        "end_time": null,
        "sequence_number": 3,
        "pid": null,
        "id": 52,
        "command_type": "ShellCommand"
      }
    ]
  },
  "pool": null,
  "account_id": 10,
  "num_result_dir": -1,
  "pid": null,
  "created_at": "2015-01-27T14:31:00Z",
  "name": null,
  "submit_time": 1422369060,
  "path": "/tmp/2015-01-27/10/2946",
  "id": 2946,
  "command_source": "API",
  "command_type": "CompositeCommand",
  "meta_data": {
    "results_resource": "commands/2946/results",
    "logs_resource": "commands/2946/logs"
  }
}
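A client typically needs only a few fields from this response: the command id, the overall status, the status of each sub-command, and the resource paths for results and logs. The following Python sketch shows one way to pull them out. summarize_workflow is a hypothetical helper, and SAMPLE_RESPONSE is a trimmed copy of the response above that keeps only the fields being read.

```python
import json

# Trimmed copy of the example response; only the fields read below are kept.
SAMPLE_RESPONSE = """
{
  "status": "waiting",
  "id": 2946,
  "command_type": "CompositeCommand",
  "command": {
    "sub_commands": [
      {"status": "waiting", "sequence_number": 1, "id": 50, "command_type": "ShellCommand"},
      {"status": "waiting", "sequence_number": 2, "id": 51, "command_type": "ShellCommand"},
      {"status": "waiting", "sequence_number": 3, "id": 52, "command_type": "ShellCommand"}
    ]
  },
  "meta_data": {
    "results_resource": "commands/2946/results",
    "logs_resource": "commands/2946/logs"
  }
}
"""


def summarize_workflow(response_body):
    """Extract the id, status, per-step status, and resource paths."""
    data = json.loads(response_body)
    sub_commands = sorted(
        data["command"]["sub_commands"],
        key=lambda sub: sub["sequence_number"],
    )
    return {
        "id": data["id"],
        "status": data["status"],
        # One (sequence_number, sub-command id, status) tuple per step.
        "steps": [
            (sub["sequence_number"], sub["id"], sub["status"])
            for sub in sub_commands
        ],
        "results_resource": data["meta_data"]["results_resource"],
        "logs_resource": data["meta_data"]["logs_resource"],
    }


summary = summarize_workflow(SAMPLE_RESPONSE)
print(summary["id"], summary["status"])
for sequence_number, sub_id, status in summary["steps"]:
    print(sequence_number, sub_id, status)
```

Sorting by sequence_number keeps the steps in execution order even if a response were to list them differently.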