Creating a New Schedule¶
Navigate to the Scheduler page and click the Create button in the left pane to create a schedule.
Press Ctrl + / to see the list of available keyboard shortcuts. See Using Keyboard Shortcuts for more information.
The schedule form is displayed. It contains the General tab, a query composer, Macros, Schedule, and Notifications.
Perform the following steps to create a schedule:
- Setting Parameters in the General Tab
- Adding a Query in Query Composer
- Adding Macros
- Setting Schedule Parameters
- Setting Notifications
A limit applies to the number of schedule reruns that can be processed concurrently at any given time. See Understanding the Qubole Scheduler Concepts for more information.
Setting Parameters in the General Tab¶
The General tab is as shown in the following figure.
In the General tab:
Enter a name in the Schedule Name* text field. This field is optional. If it is left blank, a system-generated ID is set as the schedule name.
In the Tags text field, add up to six tags to group commands together. Tags help in identifying commands. Each tag can contain a maximum of 20 characters. This field is optional. To add a tag, follow these steps:
In the Tags field, add a tag as shown below.
Press Enter. The tag is added as shown below.
Similarly, you can add more tags, up to a total of six. You can add tags to a new schedule, or to an existing schedule by editing it.
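The tag limits described above (at most six tags, at most 20 characters each) can be checked before you fill in the form. The following is a minimal sketch, not part of QDS: the function name `validate_tags` and the constants are hypothetical and exist only for illustration.

```python
MAX_TAGS = 6         # maximum number of tags per schedule, as stated above
MAX_TAG_LENGTH = 20  # maximum characters per tag, as stated above


def validate_tags(tags):
    """Return a list of problems; an empty list means the tags fit the limits."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(f"tag too long ({len(tag)} characters): {tag!r}")
    return problems


if __name__ == "__main__":
    print(validate_tags(["etl", "daily", "finance"]))
    print(validate_tags(["x" * 21]))
```

Because tags are optional, an empty list passes the check.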
Adding a Query in Query Composer¶
To add a query, perform these steps:
Select a query type from the drop-down list. If the query type has a sub-option, select it.
Select a cluster on which you want to run the query.
Select the number of command retries from the Retry drop-down list. This option is available for all command types except Db Query, Redshift Query, Refresh Table, and Workflow. For a Workflow command, Retry is available at the subcommand level.
Configuring retry initiates a blind retry of the command. This may lead to data corruption if the command fails after writing partial data. For example, retrying a failed INSERT INTO query can lead to data corruption.
- Select the duration from the Delay (mins.) drop-down list to specify the time interval between the retries when a job fails.
- Type the query in the text field.
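To see why a blind retry can corrupt data, consider the following simulation. It is a sketch only: the list `table` stands in for a target table, and `run_with_blind_retry` and `flaky_insert_into` are hypothetical names that do not correspond to any QDS API. The command appends rows the way an INSERT INTO would, and fails after a partial write on its first attempt.

```python
import time


def run_with_blind_retry(command, retries, delay_seconds=0.0):
    """Run command(); on failure, wait and re-run it, up to `retries` extra times.

    Returns the number of attempts made. Nothing written by a failed
    attempt is cleaned up, which is what makes the retry "blind".
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            command()
            return attempts
        except RuntimeError:
            if attempts > retries:
                raise
            time.sleep(delay_seconds)  # plays the role of the Delay setting


table = []          # stands in for the target table
calls = {"n": 0}    # counts how often the command has been started


def flaky_insert_into():
    """Append two rows; the first attempt fails after writing only one."""
    calls["n"] += 1
    table.append("row-1")
    if calls["n"] == 1:
        raise RuntimeError("failed after a partial write")
    table.append("row-2")


attempts = run_with_blind_retry(flaky_insert_into, retries=1)
print(attempts, table)  # 2 ['row-1', 'row-1', 'row-2']
```

The retried run succeeds, but `row-1` is now present twice because the partial write from the failed attempt was never undone.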
See Creating a Spark Schedule for more information on how to create a Spark schedule and how to schedule a Spark notebook run.
Adding Macros¶
If you have used macros in the query, click the + button in the Macros field. Otherwise, proceed to the next step. After you click the + button, the macro fields are displayed as shown in the following figure.
Enter the variable name and value in the corresponding text fields. See Macros in Scheduler for more information. Click + to add another macro. Otherwise, proceed to the next step.
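The relationship between macro variables and the query text can be illustrated with a small substitution sketch. It assumes that a macro is referenced in the query as `$name$`; confirm the actual syntax and the rules for macro values in Macros in Scheduler. The function `expand_macros`, the table name, and the macro names are hypothetical.

```python
import re


def expand_macros(query, macros):
    """Replace each $name$ placeholder in query with its value from macros.

    Assumption: placeholders look like $name$. Raises KeyError when the
    query references a macro that has no value.
    """
    def replace(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"no value for macro {name!r}")
        return str(macros[name])

    return re.sub(r"\$(\w+)\$", replace, query)


query = "SELECT * FROM events WHERE dt = '$rundate$' LIMIT $limit$"
expanded = expand_macros(query, {"rundate": "2015-05-01", "limit": 10})
print(expanded)  # SELECT * FROM events WHERE dt = '2015-05-01' LIMIT 10
```

Failing on an undefined macro, rather than leaving the placeholder in place, surfaces a missing variable before the query runs.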
Setting Schedule Parameters¶
The Schedule field contains Frequency, Time Zone, and Advanced Settings. For more information, see Understanding the Qubole Scheduler Concepts.
The following figure illustrates all parameters in Schedule.
Use the tooltips to learn more about each field.
In the Schedule field, set:
Frequency: Select a periodicity, a custom frequency, or a cron expression from the drop-down list. The frequency drop-down list is illustrated in the following figure.
A cron expression is useful for setting an exact date and time. A sample cron expression is illustrated in the following figure.
Enter the values in all the cron expression fields.
The start time by selecting the year, month, date and time (HH:MM) from the corresponding drop-down lists.
The end time by selecting the year, month, date, and time (HH:MM) from the corresponding drop-down lists.
Time Zone by selecting the appropriate timezone from the drop-down list.
Command Timeout: Set the command timeout in hours and minutes. The default value is 36 hours (129600 seconds), and any value that you set must be less than 36 hours. QDS checks the timeout for a command every 60 seconds, so a command is killed at the first check after its timeout elapses. For example, if the timeout is set to 80 seconds, the command is killed at the next check, that is, after 120 seconds. Setting this parameter prevents a command from running for the full 36 hours.
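The effect of the 60-second check interval on Command Timeout can be worked out with a small calculation. This is a sketch under one assumption: checks happen at multiples of 60 seconds counted from the start of the command. The function name `effective_kill_time` is hypothetical.

```python
CHECK_INTERVAL_SECONDS = 60          # QDS checks the timeout every 60 seconds
MAX_TIMEOUT_SECONDS = 36 * 60 * 60   # 36 hours = 129600 seconds (the default)


def effective_kill_time(timeout_seconds):
    """Return the second at which a command with this timeout is killed.

    The timeout is rounded up to the next check, using integer ceiling
    division so that no floating-point rounding is involved.
    """
    if not 0 < timeout_seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout must be positive and at most 36 hours")
    checks = -(-timeout_seconds // CHECK_INTERVAL_SECONDS)  # ceiling division
    return checks * CHECK_INTERVAL_SECONDS


print(effective_kill_time(80))      # 120
print(effective_kill_time(129600))  # 129600
```

In practice, a timeout that is not a whole number of minutes behaves as if it were rounded up to the next minute.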
When expanded, Advanced Settings displays:
Fair Scheduler pool: Enter the Fair Scheduler pool name in the text field.
Concurrency: Select the number of concurrent schedules allowed from the Concurrency drop-down list if you do not want the default value.
Dependencies: Select one of three options for the schedule:
- No Dependency (selected by default)
- Wait for Hive Partition. See Configuring Hive Tables Data Dependency for more information.
- Wait for S3 Files. See Configuring S3/Azure Blob Storage Files Data Dependency for more information.
Skip Missed Instances: Select Skip Missed Instances if you want to skip instances that were supposed to have run in the past. By default, this option is unselected. When a new schedule is created, the scheduler runs schedule actions from the start time to the current time. For example, if a daily schedule that starts on Jan 1 2015 is created on May 1 2015, schedule actions are run for Jan 1 2015, Jan 2 2015, and so on. If you do not want the scheduler to run the missed schedule actions for the months before May, select the check box to skip them.
Skipping missed schedule actions is mainly useful when you suspend a schedule and resume it later. In that case, there is more than one missed schedule action, and you might want to skip the earlier ones.
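The Jan 1 2015 example can be reproduced with a short calculation that lists the daily instances due when the schedule is created. This is a sketch of the described behavior, not the scheduler's implementation. The function names are hypothetical, and treating the instance that falls exactly on the creation time as not missed is an assumption.

```python
from datetime import datetime, timedelta


def daily_instances(start, until):
    """Return the daily nominal times from start through until, inclusive."""
    instances = []
    current = start
    while current <= until:
        instances.append(current)
        current += timedelta(days=1)
    return instances


def instances_to_run(start, created_at, skip_missed):
    """Return the instances the scheduler would run right after creation."""
    due = daily_instances(start, created_at)
    if skip_missed:
        # Assumption: only instances strictly before the creation time are missed.
        return [t for t in due if t >= created_at]
    return due


start = datetime(2015, 1, 1)
created_at = datetime(2015, 5, 1)

backlog = instances_to_run(start, created_at, skip_missed=False)
skipped = instances_to_run(start, created_at, skip_missed=True)
print(len(backlog), len(skipped))  # 121 1
```

Without the option, 121 daily instances (Jan 1 through May 1) are due at creation time. With it, only the current instance remains.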
For more information, see Understanding the Qubole Scheduler Concepts.
Setting Notifications¶
Notification is optional. Select it if you want to be notified by email about instance failure. After you select the Send notifications check box, Notification Type, Notification List, and Event are displayed.
Select Daily digest as the Notification Type to receive daily digests if the schedule periodicity is in minutes or hours. The default notification type is Immediate.
For Event, On Failure is selected by default. Select On Success to be notified about successful schedule actions. You can select either event or both.
Select the Notification Channel from the Notification List field. Notification List displays the list of Notification Channels configured. For more information on how to create a Notification Channel, see Creating Notification Channels.
After setting the parameters and filling in the required details, click Save to add the new schedule. Click Cancel if you do not want to create the schedule.