zeppelin_heap_memory_alert

This runbook shows the steps you must perform when the zeppelin_heap_memory_alert is triggered.

Alert Name: zeppelin_heap_memory_alert

Alert Condition: The condition that triggers the alert is avg(last_5m):zeppelin.heap.usage{cluster-10} by {host} > 0.8.

Alert Explanation: The alert indicates that the average heap usage of the Zeppelin JVM over the last 5 minutes is greater than 80% of the allocated heap. If heap usage continues to increase, the Zeppelin server might crash. In the alert condition above, 10 is the cluster’s ID, which varies because each cluster has a unique cluster ID. The value 0.8 is the default for zeppelin.heap_usage_threshold. You can contact Qubole Support to modify the zeppelin.heap_usage_threshold value at the account level. However, it is recommended to keep the value at 0.8.
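
To illustrate how the alert condition is evaluated, the following minimal Python sketch averages heap usage samples per host and compares the result with the threshold. It is not the monitoring system's implementation. The function name hosts_over_threshold, the host names, and the sample values are hypothetical, and the samples are assumed to be ratios of used heap to allocated heap collected over the last 5 minutes.

```python
from statistics import mean

# Default threshold from this runbook (zeppelin.heap_usage_threshold).
HEAP_USAGE_THRESHOLD = 0.8


def hosts_over_threshold(samples_by_host, threshold=HEAP_USAGE_THRESHOLD):
    """Return the hosts whose average heap usage ratio exceeds the threshold.

    samples_by_host maps a host name to the heap usage ratios
    (used heap / allocated heap) seen in the last 5 minutes.
    This data shape is an assumption made for illustration.
    """
    return sorted(
        host
        for host, samples in samples_by_host.items()
        # Skip hosts with no samples; use a strict comparison, as in the alert.
        if samples and mean(samples) > threshold
    )


# Hypothetical samples for two hosts in one cluster.
samples = {
    "host-a": [0.78, 0.82, 0.85, 0.88, 0.90],  # sustained high usage
    "host-b": [0.60, 0.65, 0.70, 0.95, 0.72],  # one short spike only
}
print(hosts_over_threshold(samples))
```

Because the condition uses a 5-minute average, a single short spike such as the one on host-b does not trigger the alert, while sustained usage above 80% such as on host-a does.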

Resolution:

Assign an instance type with more memory to the coordinator node.

Steps

  1. Navigate to the Clusters page and stop the cluster.
  2. Click the Edit button next to the cluster.
  3. On the Edit Cluster Settings page, click the Configuration tab.
  4. From the Coordinator Node Type drop-down list, select an instance type with more RAM than the current instance.
  5. Click Update to save the changes.
  6. Start the cluster.