Excessive nightly 3am server slow-downs
I have an event detector that raises an alarm if the time stamp register value obtained from a data source does not change for more than one minute (I read that source every second, over Ethernet). Shortly after 3am almost every night, the following occurs:
- The time stamp unchanged alarm is raised
- CPU idle goes down
- CPU I/O wait goes up
- Points to write goes way up
- Disk block writes go up, but disk block reads go way up
- Medium Priority Work Items shoots up from zero into the thousands
My questions:
- What are these Medium Priority Work Items? Why is the peak number so different each night?
- How can I minimize the load or quantity of that 3am thread?
- Can Mango be modified so the Medium Priority Work Items are assigned a priority that is low enough to not interfere with data source reading?
- Can the Medium Priority Work items be staggered so they're not all submitted at 3am?
- Can Mango be modified so that reading a data source is assigned a higher priority thread?
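For anyone trying to reproduce this, the "time stamp unchanged" check described at the top of this post can be sketched as below. This is only a minimal illustration of the logic, not Mango's event detector code; the class name `StaleValueDetector` and its `poll` method are hypothetical names made up for the example.

```java
import java.time.Duration;

/** Hypothetical sketch: flags a polled value that has stayed unchanged for too long. */
final class StaleValueDetector {
    private final long limitMillis;
    private Long lastValue;         // last distinct value seen (null until the first poll)
    private long lastChangeMillis;  // time at which lastValue was first observed

    StaleValueDetector(Duration limit) {
        this.limitMillis = limit.toMillis();
    }

    /**
     * Feed one poll result (for example, once per second).
     * Returns true when the value has been unchanged for longer than the limit.
     */
    boolean poll(long value, long nowMillis) {
        if (lastValue == null || lastValue != value) {
            // Value changed (or first reading): restart the clock.
            lastValue = value;
            lastChangeMillis = nowMillis;
            return false;
        }
        return nowMillis - lastChangeMillis > limitMillis;
    }
}
```

With a one-minute limit, a reading that is identical at 0 ms and at 60,001 ms trips the alarm, which is what happens here when the 1-second polling stalls.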
The nightly 'Medium Priority Work Items' that trigger the 'datasource time value unchanged for 1 minute' alarms imply that the 1-second polling of the Modbus data source is being slowed down. The problem isn't just the alarm; it is the data dropout that the alarm implies. This appears to be triggered every night by the automatic data purge:
INFO 2013-05-04 03:05:00,001 (com.serotonin.m2m2.rt.maint.DataPurge.executeImpl:60) - Data purge started
INFO 2013-05-04 03:08:31,832 (com.serotonin.m2m2.rt.maint.DataPurge.executeImpl:70) - Data purge ended, 2001716 point samples deleted
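To put a number on that burst, the log timestamps above work out to a purge of about 211.8 seconds, or roughly 9,450 samples deleted per second. The small sketch below does that arithmetic; the class and method names (`PurgeStats`, `elapsed`, `samplesPerSecond`) are made up for this example, and the timestamp pattern is assumed from the log format shown above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: derives purge duration and deletion rate from log timestamps. */
final class PurgeStats {
    // Assumed pattern for the timestamps in the log entries, e.g. "2013-05-04 03:05:00,001".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Time between the "purge started" and "purge ended" log timestamps. */
    static Duration elapsed(String start, String end) {
        return Duration.between(LocalDateTime.parse(start, LOG_TS),
                                LocalDateTime.parse(end, LOG_TS));
    }

    /** Average number of samples deleted per second over the given duration. */
    static double samplesPerSecond(long samples, Duration d) {
        return samples * 1000.0 / d.toMillis();
    }

    public static void main(String[] args) {
        Duration d = elapsed("2013-05-04 03:05:00,001", "2013-05-04 03:08:31,832");
        System.out.println("purge took " + d.toMillis() + " ms");
        System.out.println("samples/s: " + samplesPerSecond(2_001_716L, d));
    }
}
```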
A simple solution may be to give the data purge process a low (or at least lower) priority, or to run it under ionice so that it does not tax disk I/O. When could something like this be implemented?
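To illustrate the kind of change I mean, here is a rough sketch of a purge that runs on a minimum-priority thread and deletes in small batches with a pause in between. It is not based on Mango's actual `DataPurge` code: `ThrottledPurge` and the `deleteBatch` callback are hypothetical, and the callback is assumed to delete at most one batch of expired samples and return how many it removed. Java thread priority is only a hint to the scheduler and does not by itself limit disk I/O, which is why the sketch also pauses between batches.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of a batched, throttled purge; not Mango's implementation. */
final class ThrottledPurge {

    /**
     * Calls deleteBatch until it reports that nothing was deleted,
     * sleeping between batches so other work can use the disk.
     * Returns the total number of samples deleted.
     */
    static long run(IntSupplier deleteBatch, long pauseMillis) {
        long total = 0;
        int deleted;
        while ((deleted = deleteBatch.getAsInt()) > 0) {
            total += deleted;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop purging early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return total;
    }

    /** Runs the purge on a daemon thread with the lowest scheduling priority. */
    static Thread startInBackground(IntSupplier deleteBatch, long pauseMillis) {
        Thread t = new Thread(() -> run(deleteBatch, pauseMillis), "throttled-purge");
        t.setPriority(Thread.MIN_PRIORITY); // a scheduler hint only
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

The batch size and pause would need tuning: smaller batches and longer pauses stretch the purge over more time but leave more headroom for the 1-second data source reads.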
I have roughly 300 data points and am running Mango core 2.0.6.