Watchdog for Mango and its data sources
-
Hi,
Has anyone come up with a good method to monitor whether Mango is healthy, i.e. a system that sends a notification if Mango's core modules or data sources become non-responsive? I was thinking of generating process events (heartbeats) from data source activity, which an external monitoring system would then use for notification.
-
Mango already provides data source events that let it watchdog itself, so you can use those data source events.
-
Interesting question. I would also like to find a good solution.
At the moment I am using statuscake.com to monitor the external availability of my Mango server, though it really only verifies that the web server is running.
I also use LibreNMS to chart my server stats - memory, disk, CPU, and have been meaning to set up alerts for these.
Perhaps you could use a randomly changing watchdog value that is posted to a remote server using the HTTP Publisher module; a PHP or Python listener script on the remote server would then alert you if the value hasn't changed in a while.
Or, set up a separate Mango server (on a VPS or similar, which should cost only a few dollars a month) and use the Persistent TCP publisher and listener modules to send a random number across. You could make the generation of the random number quite complicated (using virtual data sources, meta points, point links, etc.) so that if any sub-system of Mango breaks, the value stops changing. On your remote Mango server you could then just set up a 'no change' alarm to email you.
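For the listener-script idea, here is a rough Python sketch of the "alert if the value hasn't changed in a while" logic. It is only an illustration under assumptions: the query parameter name `watchdog`, the port, and the 300-second timeout are all made up, and the exact request format the HTTP Publisher sends should be checked against your own configuration.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


class WatchdogMonitor:
    """Tracks when the published watchdog value last changed."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_value = None
        self.last_change = clock()

    def update(self, value):
        # Only a *changed* value counts as a sign of life.
        if value != self.last_value:
            self.last_value = value
            self.last_change = self.clock()

    def is_stale(self):
        return self.clock() - self.last_change > self.timeout_s


monitor = WatchdogMonitor(timeout_s=300)  # assumed timeout


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "watchdog" is a hypothetical parameter name; match it to
        # whatever your publisher actually sends.
        params = parse_qs(urlparse(self.path).query)
        if "watchdog" in params:
            monitor.update(params["watchdog"][0])
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    server = HTTPServer(("", port), Handler)
    server.timeout = 10  # return from handle_request() even when idle
    while True:
        server.handle_request()
        if monitor.is_stale():
            # Placeholder: swap in email/SMS; this repeats while stale.
            print("ALERT: watchdog value has not changed")
```

Calling `serve()` starts the listener. Because only a changed value resets the timer, a Mango instance that keeps re-sending the same stuck value will still trigger the alert.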
-
Using the Mango Persistent TCP data source and publisher is pretty ideal for this. There are a number of things you can publish to a central Mango for monitoring, such as a random or incrementing value, as well as Internal data source values such as JVM memory usage and task lists. There is also a little-known data source called the Log4J Data Source, which allows you to turn system log messages into Alphanumeric data points. You can have an Error-level and a Warning-level data point publishing to the monitoring Mango. You can then put a change event detector on these points, so that if any Errors show up in the log file they will be sent to the main server and you will be notified.
It's also our longer-term plan to offer a hosted solution specifically for this kind of monitoring and alarming, and we'll probably package these features up into a module that is simple to install and configure, so you don't need to do much setup.