Add monitoring of backup space capacity
Closed, Resolved · Public

Description

We use backup storage provided by Hetzner for each server. Each backup space has a capacity of 100 GB. Today Nephilia's backup space filled up. We should monitor the amount of space used in each backup space and alert when one gets close to full.
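As a rough illustration of the "close to full" check, here is a minimal sketch. The 100 GB capacity comes from the description above. Treating it as decimal gigabytes and alerting at 90% are assumptions, and the names `CAPACITY_BYTES`, `ALERT_FRACTION`, `usage_fraction` and `should_alert` are made up for this example.

```python
# Capacity per backup space, from the task description.
# Assumption: "100GB" means decimal gigabytes (10**9 bytes each).
CAPACITY_BYTES = 100 * 10**9

# Hypothetical alert threshold: warn once 90% of the space is used.
ALERT_FRACTION = 0.9


def usage_fraction(used_bytes, capacity_bytes=CAPACITY_BYTES):
    """Return the used share of the backup space as a float (0.0 = empty)."""
    return used_bytes / capacity_bytes


def should_alert(used_bytes, capacity_bytes=CAPACITY_BYTES,
                 threshold=ALERT_FRACTION):
    """Return True when usage has reached or passed the alert threshold."""
    return usage_fraction(used_bytes, capacity_bytes) >= threshold
```

In practice the threshold would more likely live in whatever alerting system reads the telegraf data, not in the collection script itself.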

I can write a script that gets the amount of space used with lftp, but it seems that if I run it through a telegraf exec input, I can't control at what time of day it runs. The best approach seems to be to add a socket listener input to telegraf, have the script send its measurements there, and add a daily cronjob to run the script.
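The plan above could look roughly like the sketch below. It is not the actual script. It assumes telegraf has a `[[inputs.socket_listener]]` section with `service_address = "udp://:8094"` and `data_format = "influx"`, that lftp's built-in `du` accepts `--bytes` and `--max-depth` (worth checking against the lftp man page), and that credentials come from lftp's own configuration. The measurement name `backup_space`, the tag names, and all function names are hypothetical.

```python
import socket
import subprocess


def run_lftp_du(url, max_depth=2):
    """Run lftp's built-in du on the backup space and return its raw output.

    Assumption: `url` is something lftp can open (e.g. an sftp:// URL) and
    authentication is already configured outside this script.
    """
    cmd = ["lftp", "-c", f"open {url}; du --bytes --max-depth={max_depth}"]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def parse_du(output):
    """Parse du-style 'SIZE<whitespace>PATH' lines into (path, size) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        size, path = line.split(None, 1)
        entries.append((path.strip(), int(size)))
    return entries


def escape_tag(value):
    """Escape commas, equals signs and spaces for an Influx tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(space, entries, measurement="backup_space"):
    """Format entries as Influx line protocol, one line per path.

    No timestamp is included, so the receiver assigns the arrival time.
    """
    return "".join(
        f"{measurement},space={escape_tag(space)},path={escape_tag(path)} "
        f"used_bytes={size}i\n"
        for path, size in entries
    )


def send_udp(payload, host="127.0.0.1", port=8094):
    """Send the payload to the (assumed) telegraf UDP socket listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))
```

A daily cronjob would then call something like `send_udp(to_line_protocol("nephilia", parse_du(run_lftp_du(url))))`, which keeps the run time under cron's control instead of telegraf's collection interval. A large directory listing may not fit in a single UDP datagram, so a real script might send one line per datagram or use a TCP or Unix socket listener instead.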

nalvarez created this task. May 2 2020, 5:47 AM
nalvarez triaged this task as Normal priority.
Restricted Application added a subscriber: sysadmin. May 2 2020, 5:47 AM

I started writing the script and tested it on edulis. No cronjob to run it yet. One possibility is running it at the end of run-backup.sh instead of having its own cronjob, but there's the risk it might run when another server is still doing backups into the same storage, and then we'd get inconsistent numbers.

This is now active on nicoda (nephilia) and edulis (storagebox-s3). I started logging when backups run, and it looks like I do need to run the storage check at different times on different servers to avoid catching a backup mid-run.
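Picking a check time from the logged backup runs can be sketched as follows. This is only an illustration of the idea: the function names are hypothetical, and it assumes each backup window is a pair of times of day taken from those logs.

```python
from datetime import time


def in_window(t, start, end):
    """Return True if time-of-day t falls inside [start, end].

    Windows that cross midnight (start later than end) are handled too.
    """
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


def is_safe_check_time(check, backup_windows):
    """Return True if the check time overlaps none of the backup windows."""
    return not any(in_window(check, start, end)
                   for start, end in backup_windows)
```

For a backup space shared by several servers, `backup_windows` would hold the windows of every server that writes to it, and the cronjob would be scheduled at a time for which `is_safe_check_time` returns True.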

This is now working for all active backup spaces. I'm not only logging the total space used, but also the size of subdirectories up to depth 2, which lets us do this: