Schedule a Jupyter Notebook
- POST /api/v1.2/scheduler
Use this API to schedule a Jupyter notebook. You can view the command’s status and result, or cancel the command, by using the same Command APIs that are used for other command types.
Note
This API is not available by default. Create a ticket with Qubole Support to enable this API on your QDS account.
Required Role
The following users can make this API call:
Users who belong to the system-user or system-admin group.
Users who belong to a group associated with a role that allows update on Jupyter Notebook and directory. See Managing Groups and Managing Roles for more information.
Users who belong to a group associated with a role that allows create on the Jupyter Notebook command.
Parameters
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
Parameter | Description |
---|---|
name | Name for the schedule. If name is not specified, a system-generated Schedule ID is set as the name. |
label | Label of the cluster on which the Jupyter notebook should be scheduled. |
command_type | Type of command to be executed. For a Jupyter notebook, the command type is JupyterNotebookCommand. |
command | JSON object that contains path (the path, including the name and extension, of the Jupyter notebook to be run) and the following optional fields. retry: the number of retries for a job; valid values are 1, 2, and 3. retry_delay: the time interval, in minutes, between retries when a job fails. arguments: valid JSON to be sent to the notebook; define the parameters in the notebook and pass their values as key-value pairs, where the key is the parameter’s name and the value is the parameter’s value. Supported parameter types are string, integer, float, and boolean. |
start_time | Start datetime for the schedule. When a Cron expression is used, start_time is not honored: the scheduler calculates the Next Materialized Time (NMT), that is, the start time, from the Cron expression, using the current time as the base time. |
end_time | End datetime for the schedule. |
frequency | Set this option or |
time_unit | Denotes the time unit for the frequency. |
For more information about the schedule parameters, see Scheduler API.
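The parameter rules above can be sketched as a small payload builder. This is only an illustration: the function name `build_schedule_payload` and its keyword arguments are hypothetical, and the checks cover only the constraints stated in the table (retry values of 1, 2, or 3, and argument values of type string, integer, float, or boolean). The server may enforce additional rules that are not reflected here.

```python
import json

ALLOWED_RETRIES = {1, 2, 3}                    # valid retry values per the table
ALLOWED_ARG_TYPES = (str, int, float, bool)    # supported argument value types


def build_schedule_payload(path, label, start_time, end_time, frequency,
                           time_unit, retry=None, retry_delay=None,
                           arguments=None, name=None):
    """Assemble the JSON body for POST /api/v1.2/scheduler (hypothetical helper)."""
    command = {"path": path}
    if retry is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(retry, bool) or retry not in ALLOWED_RETRIES:
            raise ValueError("retry must be 1, 2, or 3")
        command["retry"] = retry
    if retry_delay is not None:
        command["retry_delay"] = retry_delay   # interval in minutes
    if arguments is not None:
        for key, value in arguments.items():
            if not isinstance(value, ALLOWED_ARG_TYPES):
                raise TypeError(f"unsupported type for argument {key!r}")
        command["arguments"] = dict(arguments)

    payload = {
        "command_type": "JupyterNotebookCommand",
        "command": command,
        "label": label,
        "start_time": start_time,
        "end_time": end_time,
        "frequency": frequency,
        "time_unit": time_unit,
    }
    if name is not None:
        payload["name"] = name
    return payload


if __name__ == "__main__":
    # Placeholder path and label, for illustration only.
    body = build_schedule_payload(
        "<Path>/<Name>", "<ClusterLabel>",
        "2019-12-26T02:00Z", "2020-07-01T02:00Z", 1, "days",
        retry=2, retry_delay=4, arguments={"key1": "value1"},
    )
    print(json.dumps(body, indent=2))
```

Validating on the client side only catches obvious mistakes before the request is sent; the API response remains the authority on whether a schedule is accepted.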
Request API Syntax
Here is the Request API syntax for scheduling a Jupyter notebook.
curl -i -X POST -H "X-AUTH-TOKEN: <token>" -H "Accept: application/json" -H "Content-type: application/json" -d \
'{"command_type":"JupyterNotebookCommand", "command": {"path":"<Path>/<Name>", "retry": 2, "retry_delay": 4, "arguments": {"key1": "value1", …, "keyN": "valueN"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "<ClusterLabel>"}' \
"https://api.qubole.com/api/v1.2/scheduler"
Note
The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
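For readers who prefer a script to curl, the same request could be prepared in Python along the following lines. This is a minimal sketch that uses only the standard library. The helper name `make_schedule_request` is hypothetical, the endpoint is passed as a parameter so that another Qubole endpoint can be substituted, and the path, label, and token values are placeholders. The call that actually sends the request is left commented out.

```python
import json
import os
import urllib.request


def make_schedule_request(payload, token, endpoint="https://api.qubole.com"):
    """Build (but do not send) the POST request for the scheduler API."""
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/api/v1.2/scheduler",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "X-AUTH-TOKEN": token,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, mirroring the request syntax shown above.
payload = {
    "command_type": "JupyterNotebookCommand",
    "command": {"path": "<Path>/<Name>", "retry": 2, "retry_delay": 4},
    "start_time": "2019-12-26T02:00Z",
    "end_time": "2020-07-01T02:00Z",
    "frequency": 1,
    "time_unit": "days",
    "label": "<ClusterLabel>",
}

request = make_schedule_request(payload, os.environ.get("AUTH_TOKEN", "<token>"))
print(request.get_method(), request.full_url)

# Uncomment to send the request with a real token and payload:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```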
Sample API Request
curl -i -X POST -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" -H "Content-type: application/json" \
-d '{"command_type":"JupyterNotebookCommand", "command": {"path":"Users/[email protected]/note1.ipynb", "retry": 2, "retry_delay": 4, "arguments": {"name": "abc", "age": "20"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "spark-cluster-1"}' \
"https://api.qubole.com/api/v1.2/scheduler"
Known Limitation
If there is a warning in one of the cells when a scheduled notebook runs, the notebook stops executing at that cell.
As a workaround, to skip the warning and continue execution, add raises-exception in that cell’s metadata field by performing the following steps:
Select the cell that shows the warning.
Click on the Tools icon on the left side bar.
Click Advanced Tools.
Add raises-exception in the Cell Metadata tags field.
Re-run the API.
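The tag added in the steps above could, in principle, also be applied directly to the notebook file. The sketch below assumes that the notebook is stored as standard nbformat 4 JSON, where each cell has a `metadata` object with an optional `tags` list, and that editing the file has the same effect as the UI steps; neither assumption is confirmed by this page. The helper name `add_raises_exception_tag` is hypothetical.

```python
import json


def add_raises_exception_tag(notebook, cell_index):
    """Add the raises-exception tag to one cell of a notebook dict (nbformat 4 assumed)."""
    cell = notebook["cells"][cell_index]
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "raises-exception" not in tags:   # avoid duplicate tags on repeated runs
        tags.append("raises-exception")
    return notebook


# Minimal in-memory notebook used only for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "source": "import warnings",
         "outputs": [], "execution_count": None},
    ],
}

add_raises_exception_tag(notebook, 0)
print(json.dumps(notebook["cells"][0]["metadata"]))
```

To apply this to a real file, the notebook would be loaded with `json.load`, modified, and written back with `json.dump` before the schedule runs again.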