system-diskfullmaster

This runbook describes how to find what is consuming disk space on a partition of the cluster’s master node.

Alert Name: system-diskfullmaster

Alert Condition: The condition that triggers the alert is avg(last_5m):avg:disk_free{cluster-57244} by {host} / avg:disk_total{cluster-57244} by {host} < 0.2.

Alert Explanation: The alert indicates that the average free disk space on the cluster’s master node over the last 5 minutes is less than 20% of the total disk space. In the alert condition above, 57244 is the cluster ID, which differs for each cluster. The thresholds, that is, the 5-minute averaging window (last_5m) and the 20% limit (< 0.2), change only when the alert condition is modified. A modified alert condition applies to all clusters associated with the alert and stays in effect until the threshold values are changed again.
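
The ratio check in the alert condition can be reproduced by hand. The following is a minimal sketch, not the monitoring system's own implementation: is_below_threshold is a hypothetical helper name, and the 0.2 default simply mirrors the threshold quoted above.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when free/total is below the threshold.
# Arguments: free space, total space (same unit), threshold (default 0.2).
is_below_threshold() {
    awk -v free="$1" -v total="$2" -v limit="${3:-0.2}" \
        'BEGIN { exit !(total > 0 && free / total < limit) }'
}

# Example: 15 GB free out of 100 GB is a ratio of 0.15, below 0.2.
if is_below_threshold 15 100; then
    echo "alert: free ratio below 0.2"
else
    echo "ok"
fi
```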

Resolution:

Step 1

Log in to the master node of the cluster and run this command to see which mountpoint is full.

# df -h
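
On a node with many mountpoints, the df output can be filtered to show only the nearly full ones. This is a sketch under stated assumptions: full_mounts is a hypothetical helper, the 80% cutoff corresponds to the alert's 20% free threshold, and mountpoints are assumed to contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P` output on stdin and prints the mountpoint
# and use percentage of every filesystem at or above the limit (default 80).
# Assumes device names and mountpoints contain no spaces.
full_mounts() {
    awk -v limit="${1:-80}" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)      # strip the trailing % from the Capacity column
        if (pct + 0 >= limit) print $6, $5
    }'
}

# -P selects the POSIX output format, one line per filesystem.
df -P | full_mounts 80
```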

Step 2

After identifying the mountpoint, run this command to find the largest files and directories under it.

# du -a <mountpoint> | sort -n -r | head -n 5

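The above command lists the five largest entries (files and directories) under <mountpoint>. The first entry is normally <mountpoint> itself, because its total includes everything below it.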
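
The du command above also counts individual files and descends into any other filesystems mounted below <mountpoint>. A possible variant that reports directories only and stays on one filesystem is sketched here; largest_dirs is a hypothetical helper name, and the default of five results is just a convenience.

```shell
#!/bin/sh
# Hypothetical helper: print the N largest directories (sizes in KB) under a
# path without crossing into other mounted filesystems.
largest_dirs() {
    # -x: stay on one filesystem; -k: report sizes in 1 KB units.
    # Permission errors are discarded so they do not clutter the listing.
    du -xk "$1" 2>/dev/null | sort -rn | head -n "${2:-5}"
}
```

For example, running largest_dirs /var 10 as root would list the ten largest directories under /var.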

To view the largest files and directories under the current working directory, change to that directory and run this command.

<CurrentWorkingDirectory># du -a | sort -n -r | head -n 5

Run man du for details on various command options.

You can also see the sizes in human-readable units (KB, MB, or GB) by running this command.

<CurrentWorkingDirectory># du -hs * | sort -rh | head -5

The above command displays the five entries in the current directory that use the most disk space. You can delete directories or sub-directories that you do not need in order to free up disk space.

To view the largest directories under a directory, say <WorkingDirectory>, with each sub-directory listed separately and its size excluding its own sub-directories, run:

<WorkingDirectory># du -Sh | sort -rh | head -5

To list only the largest files under a directory, say <WorkingDirectory>, run the following command:

<WorkingDirectory># find -type f -exec du -Sh {} + | sort -rh | head -n 5

To find the largest files in a specific location, pass its path to the find command:

<WorkingDirectory># find /<mountpoint>/<folder>/<subfolder>/ -type f -exec du -Sh {} + | sort -rh | head -n 5

OR, to print the exact file sizes in bytes (GNU find):

<WorkingDirectory># find /<mountpoint>/<folder>/<subfolder>/ -type f -printf "%s %p\n" | sort -rn | head -n 5
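
Instead of ranking files, it can be useful to list every file above a given size. The sketch below assumes GNU find (for the M size suffix); files_over_mb is a hypothetical helper name and the size limit is whatever you choose.

```shell
#!/bin/sh
# Hypothetical helper: list regular files larger than a size (in MB) under a
# path, staying on one filesystem. Assumes GNU find.
files_over_mb() {
    # -xdev: do not descend into other filesystems
    # -size +NM: file size, rounded up to whole MB, is greater than N
    find "$1" -xdev -type f -size +"$2"M 2>/dev/null
}
```

For example, files_over_mb /var 500 would print every file under /var that is larger than 500 MB.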

Step 3

After finding the files and directories that consume the most disk space, you can compress, archive, or remove the ones you no longer need.
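
As one way to reclaim space without deleting data, old log files can be compressed in place. This is a sketch only: compress_old_logs is a hypothetical helper, and the *.log pattern and the age cutoff are assumptions to adapt to the files found in Step 2. Note that the space used by a file is not released while a running process still holds it open.

```shell
#!/bin/sh
# Hypothetical helper: gzip every *.log file under a directory that was last
# modified more than N days ago. gzip replaces each file with <name>.gz.
compress_old_logs() {
    # -mtime +N: modified more than N days ago
    # -exec ... {} +: pass many files to a single gzip invocation
    find "$1" -type f -name '*.log' -mtime +"$2" -exec gzip {} +
}
```

For example, compress_old_logs /<mountpoint>/<folder> 7 would compress the log files in that folder that have not changed for more than a week.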