SUNScholar/Disaster Recovery/System Monitor
Introduction
Now that you have a large number of servers, you will want to know how they are performing and to be informed of potential problems. At our library we use munin to do this.
Server Setup
It is assumed that you will be using the same server for monitoring and backup. To set up munin to gather client statistics, follow the procedure below.
Add a firewall rule to allow the server to get the stats:
ufw allow 4949
To install munin, type the following:
apt-get install munin
Add the clients to the /etc/munin/munin.conf file as follows.
Open the file for editing.
nano /etc/munin/munin.conf
Add a block like the following for each client:
[%hostname-of-client%]
use_node_name yes
address %ip-address-of-client%
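With many clients, typing each block by hand is error-prone. As a minimal sketch, the block could be generated by a small shell function. The function name munin_host_stanza and the example hostname and address are hypothetical; they are not part of munin.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with munin): print a munin.conf
# host block for one client on standard output.
munin_host_stanza() {
    client_host="$1"   # name under which the client appears in the web pages
    client_addr="$2"   # address the monitoring server polls on port 4949
    printf '[%s]\n    use_node_name yes\n    address %s\n\n' \
        "$client_host" "$client_addr"
}
```

For example, running munin_host_stanza web1.example.org 192.0.2.10 >> /etc/munin/munin.conf as root would append one block for that client.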
Check that the munin files and directories have the correct permissions:
munin-check
Wait about 5 to 10 minutes for munin to gather data, then view the statistics as follows.
Open a web browser and type the following in the address bar:
http://%hostname-of-monitoring-server%/munin
You should see the munin overview page listing the monitored hosts.
Client Setup
Log in and become the root user.
Create a PostgreSQL credentials file before continuing.
Install the munin node and the Perl PostgreSQL driver used by its PostgreSQL plugins:
apt-get install libdbd-pg-perl
apt-get install munin-node
Set up munin-node to allow the monitoring server to gather statistics. Open the configuration file for editing:
nano /etc/munin/munin-node.conf
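Add the following to the bottom of the file. The allow directive takes a regular expression, so escape each dot in the address with a backslash (for example ^192\.0\.2\.1$):
allow ^%ip-address-of-monitoring-server%$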
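Because the allow directive in munin-node.conf is matched as a regular expression, a plain IP address should have its dots escaped and both ends anchored. The following is a sketch of a helper that builds such a line; the function name munin_allow_line and the example address are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a plain IPv4 address into an anchored
# "allow" line suitable for munin-node.conf.
munin_allow_line() {
    # Escape every dot so it matches a literal "." rather than any character.
    escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Anchor with ^ and $ so that no longer address can match by accident.
    printf 'allow ^%s$\n' "$escaped"
}
```

For example, munin_allow_line 192.0.2.1 >> /etc/munin/munin-node.conf would append the line for a monitoring server at that address.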
Change the host_name line as follows, uncommenting it if necessary:
host_name %hostname-of-client%
Save the file. Run the following command to enable the plugins that munin suggests for this client:
munin-node-configure --shell | bash -
List the plugins that are now enabled:
ls -l /etc/munin/plugins
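Plugins enabled by munin-node-configure are symbolic links in the plugin directory. As a rough sketch, the enabled plugin names could be printed one per line as shown below. The function name list_munin_plugins is hypothetical, and plugins copied into the directory as regular files would not be listed.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the symbolic links in a munin
# plugin directory (default: /etc/munin/plugins), one per line.
list_munin_plugins() {
    plugin_dir="${1:-/etc/munin/plugins}"
    for entry in "$plugin_dir"/*; do
        # Only symbolic links are reported; other entries are skipped.
        [ -L "$entry" ] && basename "$entry"
    done
    return 0
}
```

Calling list_munin_plugins with no argument reads /etc/munin/plugins; pass another directory as the first argument to inspect a different location.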
Add a firewall rule to allow the monitoring server to get the stats:
ufw allow 4949
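Once the firewall rule is in place and munin-node has been restarted so that it rereads munin-node.conf (for example with service munin-node restart), the connection can be checked from the monitoring server. A munin node greets each connection with a line of the form "# munin node at hostname". The sketch below assumes a netcat binary named nc is installed; the function names parse_munin_banner and check_munin_node are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read munin-node output on standard input and
# print the host name taken from the greeting line.
parse_munin_banner() {
    sed -n 's/^# munin node at \(.*\)$/\1/p' | head -n 1
}

# Hypothetical helper: connect to a node, send "list" and "quit", and
# print the host name the node reports. Assumes "nc" is available.
check_munin_node() {
    node_addr="$1"
    node_port="${2:-4949}"
    printf 'list\nquit\n' | nc -w 5 "$node_addr" "$node_port" | parse_munin_banner
}
```

For example, check_munin_node %ip-address-of-client% should print the host_name configured on that client. No output suggests that the firewall or the allow line is blocking the connection.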
That's it. As usual, there is a lot of documentation about Munin available online.