We have seen some significant jumps in CPU usage over the past 6 days. The instance has been running for 57 days at only 2–4% CPU, but now the Java process is taking 99% on a Windows server. (We've since restarted the Mango service and things are back to normal, but we hope to find out what caused the spike so we can avoid it in the future.)
We have deployed 12 Modbus devices (water pumps) over the past week, so could it be something related to how we are setting them up?
More details are below:
Java Version: OpenJDK14U-jdk_x64_windows_hotspot_14.0.1_7
• From 2% to 27% on 2/26 @ 16:40 – 51 days uptime with ~2 – 4% CPU usage
• From 27% to 39% on 2/26 @ 19:20
• From 39% to 65% on 2/27 @ 17:00
• From 65% to 100% on 3/2 @ 16:25
12 water pumps using Modbus data sources. Each pump also has a corresponding scripting data source that runs every 3 seconds to check whether the user has clicked a button to start or stop the pump. In addition, there is 1 Meta data source that houses 5 data points per pump for various other calculated fields (60 Meta data points total across the 12 pumps).
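For context, each per-pump scripting data source is conceptually doing something like the sketch below. This is a standalone simplification with placeholder names (startButtonPressed, stopButtonPressed, etc.), not our actual script, which runs inside Mango's server-side scripting engine with context points bound to the real Modbus points.

```javascript
// Simplified stand-in for the per-pump scripting data source logic.
// In Mango the real script runs in the server-side scripting engine with
// context points; all names here are placeholders, not our actual XIDs.
function evaluatePumpCommand(startButtonPressed, stopButtonPressed, currentlyRunning) {
  // A start request only matters if the pump is not already running.
  if (startButtonPressed && !currentlyRunning) return "START";
  // A stop request only matters if the pump is actually running.
  if (stopButtonPressed && currentlyRunning) return "STOP";
  // Otherwise leave the pump state alone.
  return "NO_CHANGE";
}

// The scripting data source repeats this check every 3 seconds per pump.
console.log(evaluatePumpCommand(true, false, false)); // START
console.log(evaluatePumpCommand(false, true, true));  // STOP
console.log(evaluatePumpCommand(false, false, true)); // NO_CHANGE
```

With 12 pumps this works out to 12 script executions every 3 seconds, on top of the Modbus polling and Meta point updates.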
The IT team gets the data sources configured in Mango along with a dashboard/UI, ready for the field techs to add an IP address and enable them. Once the devices are enabled, the field techs tweak cell modems, check or uncheck the Encapsulation option depending on whether the device is RTU, and make other modifications like that to get Mango reading values properly. While these tweaks are being worked on, Mango usually cannot communicate with the cell modem and the slave monitor shows 0.
Timing of Pump Data Sources Being Enabled and Worked On:
Name - Enable Date
Pump 1 - 2/26 @ 3pm
Pump 2 - 2/27 @ 2:25pm
Pump 3 - 2/27 @ 2:58pm
Pump 4 - 3/1 @ 1:01pm
Pump 5 - 3/1 @ 5:03pm
Pump 6 - 3/2 @ 3:27pm
Pump 7 - 3/2 @ 4:07pm
Pump 8 - 3/2 @ 4:21pm
Pump 9 - 3/3 @ 8:02am
Pump 10 - 3/3 @ 3:30pm
Pump 11 - 3/3 @ 3:47pm
Pump 12 - 3/3 @ 3:56pm
We reviewed the ma.log files and found quite a bit going on. Yesterday the logs were rolling over every 3 minutes because they were hitting the 5 MB size limit.
On 2/26 we started to get the following error roughly every 4 milliseconds, about 250 per second. It would stop for a while, then come back around the same times the CPU increases occurred.
ERROR 2021-02-26T15:44:12,322 (com.serotonin.m2m2.db.dao.AbstractBasicDao$1.extractData:1063) - Internal Server Error org.eclipse.jetty.io.EofException: Closed com.infiniteautomation.mango.rest.v2.exception.GenericRestException: Internal Server Error org.eclipse.jetty.io.EofException: Closed
We've also noticed the following in the log file, though not nearly as frequently as the error listed above.
WARN 2021-03-02T15:48:34,770 (org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.failed:455) - java.nio.channels.ClosedChannelException: null
Any help to dig into this is appreciated. Let me know if any additional info is needed and we’ll provide what we can.