Receive an e-mail alert about a stopped Docker container

If you are a happy user of hass.io, you probably know that your setup consists of multiple docker containers. There is a container for home-assistant itself, a supervisor container that controls installation and upgrades of the HA software, and one or more dedicated containers for hass.io addons. On top of that, you may have your own containers with additional software, e.g. zerotier-one for a private VPN or a separate mosquitto broker.

In some circumstances one or more containers may stop working or even get stuck in an endless restart loop. The most common case for me was a container that had not been set up for automatic restart: after the host (or the docker daemon) was restarted, the container stayed offline. It is okay if you are the one who notices the interruption first. If it is your wife, the WAF (wife acceptance factor) of your home automation solution may drop to about zero.
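If a missing restart policy is the suspected cause, it can be checked up front. The following is a minimal sketch, assuming your own containers are named zerotier-one and mosquitto (hypothetical names) and using a made-up helper called without_restart_policy. It only touches containers you created yourself; the containers managed by the hass.io supervisor are better left alone.

```shell
#!/bin/sh
# Sketch: print the containers that have no automatic restart policy.
# without_restart_policy is a hypothetical helper name.

without_restart_policy() {
  for name in "$@"
  do
    # Ask docker for the restart policy name of this container
    policy=$(docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' "$name")
    # Treat an empty value the same as "no" (no automatic restart)
    case "$policy" in
      ""|no) echo "$name" ;;
    esac
  done
}

# Example call (container names are assumptions):
# without_restart_policy zerotier-one mosquitto
#
# To enable automatic restart for one of your own containers:
# docker update --restart unless-stopped zerotier-one
```

Any name printed by the function is a candidate for docker update --restart, as shown in the last comment.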

Obviously, you cannot rely on home-assistant automation rules to monitor home-assistant's own availability, so we need something that works independently. Something very robust and stable. The best and most widely known choice I came up with is monit.

The basic idea is fairly simple: you create a shell script which asks docker for a list of running containers. If some container names are missing, the script returns a non-zero exit code. Monit executes the script on a regular basis and, if the exit code is not zero, sends you an alert notification (via email or other means).

Shell script

We will start with the shell script which checks the status of the docker containers:

$ mkdir ~/monit
$ cd ~/monit
$ touch monit_docker_check.sh
$ chmod +x monit_docker_check.sh

Put the following content in the file, replacing the container names with your own.

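#!/bin/sh
# Containers that must be running; replace with your own names.
list_of_containers="homeassistant addon_a0d7b954_appdaemon3 hassio_supervisor"
# Names of all currently running containers, one per line.
containers=$(docker ps -f status=running --format "{{.Names}}")
for container in $list_of_containers
do
  # -x matches the whole line and -F treats the name literally,
  # so "homeassistant" does not match a longer container name.
  if echo "$containers" | grep -qxF "$container"
  then
    echo "$container online"
  else
    echo "$container offline"
    exit 1
  fi
done
exit 0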

To check that the script works, run it and look at the exit code. If some containers are stopped, the exit code should be 1.

$ ./monit_docker_check.sh
homeassistant offline
$ echo $?
1
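The script above stops at the first container that is not running, so the alert names only one of them. As a minimal sketch of an alternative, the function below collects every missing name. The helper name missing_containers and the way the two lists are passed in as arguments are assumptions made for illustration, not part of the original script.

```shell
#!/bin/sh
# Sketch: report every expected container that is not running.
# missing_containers is a hypothetical helper name.

# $1: expected container names, separated by spaces
# $2: running container names, one per line
missing_containers() {
  for name in $1
  do
    # -q: no output, -x: whole-line match, -F: literal string
    if ! printf '%s\n' "$2" | grep -qxF "$name"
    then
      echo "$name"
    fi
  done
}

# Usage inside the check script (needs docker on the host):
# running=$(docker ps -f status=running --format "{{.Names}}")
# missing=$(missing_containers "homeassistant hassio_supervisor" "$running")
# [ -z "$missing" ] || { echo "offline: $missing"; exit 1; }
```

Passing the list of running containers in as an argument keeps the function easy to try out with made-up names, without stopping real containers.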

Set up email configuration for monit

If you have not configured monit yet, here is how to set up email. On most Linux systems the monit configuration file is located at /etc/monit/monitrc. Open the file in the editor of your choice:

$ sudo nano /etc/monit/monitrc

and add the following lines:

set mailserver smtp.gmail.com port 587
    username "user" password "password"
    using tlsv1

set alert mymail@gmail.com

Note that you should obtain an application password from Google; it is unlikely that your regular Gmail password will work here. You can test your email settings by firing a test notification. Add the following statement to monitrc:

check file alerttest with path /.nonexistent
   alert mymail@gmail.com

Reload monit with the command:

$ sudo monit reload
Reinitializing monit daemon

Soon you should see an email alert in your mailbox.

Add a check rule to monitrc

Now we have to tell monit to run the shell script periodically and send us an alert if some of the containers are not running. Add the following to monitrc (replace /home/user/monit with the actual location of your shell script):

check program docker_check with path /home/user/monit/monit_docker_check.sh
    with timeout 500 seconds
    if status != 0 then alert

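Here you tell monit to run the script on every poll cycle (the interval is set by the set daemon statement in monitrc), to give up on the script if it runs longer than 500 seconds, and to send you an alert if at least one container is not running. Reload the monit configuration with the following command: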

$ sudo monit reload
Reinitializing monit daemon 
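After editing monitrc by hand it can be worth validating the file before each reload. The sketch below wraps the two steps in a function: monit -t runs a syntax check of the control file, and the reload happens only if that check passes. The function name safe_reload is made up for this example.

```shell
#!/bin/sh
# Sketch: reload monit only if the control file passes the syntax check.
# safe_reload is a hypothetical helper name, not a monit command.

safe_reload() {
  # "monit -t" runs a syntax check of the control file
  if sudo monit -t
  then
    sudo monit reload
  else
    echo "monitrc has errors, not reloading" >&2
    return 1
  fi
}

# Usage: call the function in place of "sudo monit reload"
# safe_reload
```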

Check if it works

Try stopping a container with the docker stop command, or just add a fake container name to the shell script. After some time monit should send you a notification about the stopped container. When the container is running again, another email will be sent to inform you that everything works fine now.

Don’t forget to add new containers to the list so that monit can monitor them as well.