Understanding Cluster Failure Notifications
After you configure a notification channel for a cluster, Qubole sends notifications about that cluster through the channel. To learn how to configure a notification channel for a cluster, see Creating Notification Channels and Advanced configuration: Modifying Cluster Monitoring Settings.
Qubole classifies its cluster notifications into three categories based on severity level.
By default, Qubole sends only notifications that fall under the Error severity level. Contact Qubole Support if you also want to receive notifications at the other severity levels. The subject of each notification starts with the prefix [Level] [ID: Notification_id] Qubole Cluster <cluster_label>.
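The subject prefix can be used to filter or route notifications on the receiving side. The following is a minimal Python sketch, not a Qubole API. It assumes that the prefix appears literally in the layout shown above, with the severity level and notification ID in square brackets. The names `SUBJECT_PREFIX`, `SubjectPrefix`, and `parse_subject`, as well as the sample subject line, are hypothetical and used only for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, taken from the prefix described in this topic:
#   [Level] [ID: Notification_id] Qubole Cluster <cluster_label>
# Everything after "Qubole Cluster" is captured as-is, because the exact
# boundary between the cluster label and the rest of the subject is not
# specified here.
SUBJECT_PREFIX = re.compile(
    r"^\[(?P<level>[^\]]+)\]\s*"
    r"\[ID:\s*(?P<notification_id>[^\]]+)\]\s*"
    r"Qubole Cluster\s+(?P<rest>.*)$"
)


class SubjectPrefix(NamedTuple):
    level: str            # severity level, for example "Error"
    notification_id: str  # value that follows "ID:"
    rest: str             # cluster label plus any trailing subject text


def parse_subject(subject: str) -> Optional[SubjectPrefix]:
    """Return the parsed prefix, or None if the subject does not match."""
    match = SUBJECT_PREFIX.match(subject.strip())
    if match is None:
        return None
    return SubjectPrefix(
        level=match["level"].strip(),
        notification_id=match["notification_id"].strip(),
        rest=match["rest"].strip(),
    )


# Hypothetical subject line, for illustration only.
parsed = parse_subject("[Error] [ID: 1234] Qubole Cluster etl-prod")
if parsed is not None and parsed.level == "Error":
    print(f"Error notification {parsed.notification_id} for {parsed.rest}")
```

Returning `None` for subjects that do not match lets a mail filter or webhook handler pass unrelated messages through unchanged.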
This topic describes the purpose of each notification that you receive when a cluster operation fails.
- Cluster Start Failure: Qubole sends this notification when the cluster fails to start.
- Cluster Terminate Failure: Qubole sends this notification when cluster termination fails.
- Cluster Upscaling Failure: Qubole sends this notification when cluster upscaling fails.
- Cluster Downscaling Failure: Qubole sends this notification when cluster downscaling fails.
- Cluster Spotloss: Qubole sends this notification when the cluster experiences frequent spot loss. This notification is disabled by default. To enable it, create a ticket with Qubole Support.
- Cluster Health Check Failure: Qubole sends this notification when a health check fails on the cluster. The notification includes details on the causes of the health check failure.
- Cluster Timed out: Qubole sends this notification if the cluster does not start within 10 minutes (default value). To change the default value, create a ticket with Qubole Support.