HMS Liveness

This runbook describes the steps to take when the Hive Metastore Server (HMS) is no longer live.

  • Check the Hive Metastore logs for evident errors and for the reason for the shutdown or failure.
  • If monit reports the status “Execution failed”, monit was unable to restart the process on its own. Restart the process manually (see “Restart of Process”).
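
The status check in the second step can be sketched as a small shell helper. This is a minimal sketch, not an official monit feature: the function name needs_restart is hypothetical, and it assumes the summary prints the service name and its status on the same line, which may differ between monit versions.

```shell
# Hypothetical helper: reads `monit summary` text on stdin and succeeds
# (exit code 0) if the line for the given service reports "Execution failed".
# Assumption: service name and status appear on the same output line.
needs_restart() {
  grep -F "$1" | grep -qi 'execution failed'
}

# Example use on the coordinator node:
#   sudo monit summary | needs_restart metastore1_2 && echo "manual restart required"
```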

Logs

  • HMS logs are available on the coordinator node at “/media/ephemeral0/logs/hive1.2/hive_ms.log”.
  • Look for any evident errors (a basic grep and a count of the matching lines is a good start).
  • Look at the dashboards defined above (“title” is the name of the dashboard).
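
The basic grep and count mentioned above might look like the following. It is a sketch under assumptions: the helper names count_errors and recent_errors are hypothetical, and the ERROR and FATAL markers assume the log4j-style levels Hive normally writes; adjust the pattern if the log uses a different format.

```shell
# Log path taken from this runbook; pass another file as the first argument to override.
HMS_LOG="${1:-/media/ephemeral0/logs/hive1.2/hive_ms.log}"

# count_errors FILE: print the number of lines containing ERROR or FATAL.
# grep -c exits non-zero when the count is 0, so the exit code is ignored.
count_errors() {
  grep -cE 'ERROR|FATAL' "$1" || true
}

# recent_errors FILE [N]: print the last N (default 20) matching lines.
recent_errors() {
  grep -E 'ERROR|FATAL' "$1" | tail -n "${2:-20}"
}

# Only report when the log file is actually readable on this node.
if [ -r "$HMS_LOG" ]; then
  echo "error lines: $(count_errors "$HMS_LOG")"
  recent_errors "$HMS_LOG"
fi
```

The last matching lines are usually the most relevant ones, since they are closest to the time of the shutdown or failure.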

Restart of Process

  • Run sudo monit summary to check the status of the process.
  • Run sudo monit stop metastore1_2 to stop the process.
  • Run sudo monit start metastore1_2 to start the process.
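
The three commands above can be combined into one restart sequence. The sketch below is illustrative only: the function restart_hms and the variables MONIT_CMD and HMS_RESTART_WAIT are hypothetical names, and the 5-second pause between stop and start is an assumed value, not something monit requires.

```shell
SERVICE="metastore1_2"   # service name as used in this runbook

# Hypothetical helper: stop the service, wait briefly, start it again,
# then print the monit summary so the new status can be checked.
# MONIT_CMD and HMS_RESTART_WAIT are illustrative overrides, not monit settings.
restart_hms() {
  monit_cmd="${MONIT_CMD:-sudo monit}"
  $monit_cmd stop "$SERVICE" || return 1
  sleep "${HMS_RESTART_WAIT:-5}"   # assumed pause so the old process can exit
  $monit_cmd start "$SERVICE" || return 1
  $monit_cmd summary
}
```

After defining the function in a shell on the coordinator node, run restart_hms and confirm in the printed summary that the process is running again.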