Schedule a Jupyter Notebook

POST /api/v1.2/scheduler

Use this API to schedule a Jupyter notebook. You can view the command’s status and result, or cancel the command, by using the same Command APIs that are used for other command types.

Note

This API is not available by default. Create a ticket with Qubole Support to enable this API on your QDS account.

Required Role

The following users can make this API call:

  • Users who belong to the system-user or system-admin group.

  • Users who belong to a group associated with a role that allows update on Jupyter Notebook and directory. See Managing Groups and Managing Roles for more information.

  • Users who belong to a group associated with a role that allows create on the Jupyter Notebook command.

Parameters

Note

Parameters marked in bold below are mandatory. Others are optional and have default values.

Parameter

Description

name

Name for the schedule. If name is not specified, then a system-generated Schedule ID is set as the name.

label

Label of the cluster on which the Jupyter notebook should be scheduled.

command_type

Type of command to be executed. For Jupyter notebook, the command type is JupyterNotebookCommand.

command

JSON object that contains the following fields:

  • path: path of the Jupyter notebook to be run, including the notebook name and the .ipynb extension.

  • retry (optional): number of retries for a job. Valid values are 1, 2, and 3.

  • retry_delay (optional): time interval (in minutes) between retries when a job fails.

  • arguments (optional): valid JSON to be sent to the notebook. Define the parameters in the notebook and pass their values in JSON format, where each key is a parameter’s name and each value is that parameter’s value. Supported parameter types are string, integer, float, and boolean.

start_time

Start datetime for the schedule. When a Cron expression is used, the scheduler calculates the Next Materialized Time (NMT), that is, the start time, from the Cron expression passed, using the current time as the base time. In that case, start_time is not honored.

end_time

End datetime for the schedule.

frequency

Specifies how often the schedule should run. Input is an integer. For example, a frequency of one hour, one day, or one month is represented as {"frequency":"1"}, with the unit given by time_unit. Set either this option or cron_expression, but not both.

time_unit

Denotes the time unit for the frequency. Its default value is days. Accepted values are minutes, hours, days, weeks, and months.

For more information about the schedule parameters, see Scheduler API.
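The constraints in the parameter descriptions above (retry limited to 1, 2, or 3; time_unit limited to five values with days as the default; frequency and cron_expression being mutually exclusive) can be checked before a request is sent. The following is a minimal sketch, not part of the API itself: the function names, the argument names inside arguments, and the example notebook path are hypothetical, and passing argument values as native JSON types rather than strings is an assumption.

```shell
# Returns 0 when the retry count is one of the documented values (1, 2, or 3).
is_valid_retry() {
  case "$1" in
    1|2|3) return 0 ;;
    *)     return 1 ;;
  esac
}

# Returns 0 when the value is one of the documented time units.
is_valid_time_unit() {
  case "$1" in
    minutes|hours|days|weeks|months) return 0 ;;
    *)                               return 1 ;;
  esac
}

# Prints a "command" object.
#   $1 = notebook path, $2 = retry, $3 = retry_delay (minutes)
# The argument names below are hypothetical; they cover the four supported
# types: string, integer, float, and boolean.
build_command_json() {
  is_valid_retry "$2" || { echo "retry must be 1, 2, or 3" >&2; return 1; }
  cat <<EOF
{"path": "$1", "retry": $2, "retry_delay": $3,
 "arguments": {"region": "us-east", "max_rows": 1000, "threshold": 0.75, "dry_run": true}}
EOF
}

# Prints the scheduling fields of the payload.
#   $1 = frequency (integer), $2 = time_unit (defaults to days),
#   $3 = cron_expression (optional)
# Rejects input that sets both frequency and cron_expression.
build_schedule_fields() {
  frequency="${1:-}"
  time_unit="${2:-days}"
  cron_expression="${3:-}"

  if [ -n "$frequency" ] && [ -n "$cron_expression" ]; then
    echo "set either frequency or cron_expression, not both" >&2
    return 1
  fi
  if [ -n "$cron_expression" ]; then
    printf '"cron_expression": "%s"\n' "$cron_expression"
    return 0
  fi
  case "$frequency" in
    ''|*[!0-9]*) echo "frequency must be an integer" >&2; return 1 ;;
  esac
  is_valid_time_unit "$time_unit" || { echo "unsupported time_unit: $time_unit" >&2; return 1; }
  printf '"frequency": %s, "time_unit": "%s"\n' "$frequency" "$time_unit"
}

build_command_json "Users/<user>/example.ipynb" 2 4
build_schedule_fields 1 hours   # prints: "frequency": 1, "time_unit": "hours"
```

Calling build_schedule_fields with only a frequency falls back to days, which mirrors the documented default for time_unit.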

Request API Syntax

Here is the Request API syntax for scheduling a Jupyter notebook.

curl -i -X POST -H "X-AUTH-TOKEN: <token>" -H "Accept: application/json" -H "Content-type: application/json" -d \
 '{"command_type":"JupyterNotebookCommand", "command": {"path":"<Path>/<Name>", "retry": 2, "retry_delay": 4, "arguments": {"key1": "value1", …, "keyN": "valueN"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "<ClusterLabel>"}' \
 "https://api.qubole.com/api/v1.2/scheduler"

Note

The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.

Sample API Request

curl -i -X POST -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" -H "Content-type: application/json" \
-d  '{"command_type":"JupyterNotebookCommand", "command": {"path":"Users/[email protected]/note1.ipynb", "retry": 2, "retry_delay": 4, "arguments": {"name": "abc", "age": "20"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "spark-cluster-1"}' \
"https://api.qubole.com/api/v1.2/scheduler"

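When the request is sent from a script, it can help to keep the payload in a file and to check the HTTP status instead of reading the headers printed by -i. The following is a sketch under stated assumptions: the QDS_ENDPOINT variable, the function names, and the response file name are hypothetical, and treating any 2xx status as success is an assumption, since the response codes are not described here.

```shell
# Assumed environment variables (not defined by the API documentation):
#   AUTH_TOKEN   - QDS API token
#   QDS_ENDPOINT - base URL; defaults to the endpoint used in the sample request
QDS_ENDPOINT="${QDS_ENDPOINT:-https://api.qubole.com}"

# Prints the scheduler URL for the configured endpoint.
scheduler_url() {
  printf '%s/api/v1.2/scheduler\n' "$QDS_ENDPOINT"
}

# POSTs the JSON payload stored in file $1 and saves the response body to $2
# (default: schedule_response.json). Succeeds only on a 2xx status.
schedule_notebook() {
  payload_file="$1"
  response_file="${2:-schedule_response.json}"
  status=$(curl -s -o "$response_file" -w '%{http_code}' \
    -X POST \
    -H "X-AUTH-TOKEN: ${AUTH_TOKEN}" \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    -d "@${payload_file}" \
    "$(scheduler_url)")
  case "$status" in
    2??) echo "scheduled (HTTP $status)" ;;
    *)   echo "request failed (HTTP $status)" >&2; return 1 ;;
  esac
}

# Example call (not executed here): schedule_notebook payload.json
```

Setting QDS_ENDPOINT to one of the other supported endpoints changes the target URL without editing the function.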
Known Limitation

If there is a warning in one of the cells when a scheduled notebook runs, the notebook stops executing at that cell. As a workaround, to skip the warning and continue execution, add raises-exception in that cell’s metadata field by performing the following steps:

  1. Select the cell that shows the warning.

  2. Click the Tools icon in the left sidebar.

  3. Click Advanced Tools.

  4. Add raises-exception in the Cell Metadata tags field.

  5. Re-run the API.
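To confirm that the tag was saved, you can inspect the notebook file. In the standard notebook file format, cell tags are stored as a list named tags inside each cell's metadata. The sketch below assumes that the Cell Metadata tags field writes to that list; the cell contents and the temporary file are hypothetical stand-ins for a real notebook.

```shell
# Write a minimal, hypothetical code cell as it might appear in the "cells"
# array of an .ipynb file after the tag has been added.
cell_file=$(mktemp)
cat > "$cell_file" <<'EOF'
{
  "cell_type": "code",
  "metadata": {
    "tags": ["raises-exception"]
  },
  "source": ["import warnings\n", "warnings.warn(\"example\")"],
  "outputs": [],
  "execution_count": null
}
EOF

# Simple text check that the tag is present in the saved JSON.
if grep -q '"raises-exception"' "$cell_file"; then
  echo "tag present"
else
  echo "tag missing"
fi
```

The same grep check can be pointed at a downloaded .ipynb file. It only reports that the tag appears somewhere in the file, not which cell carries it.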