Consumption of JVM memory leads to disconnect of MQTT data source
-
Heya,
(v3.6.5)
I have been tracking an issue that causes the MQTT data source to disconnect from its broker.
What I have been seeing in the ma.log is:
WARN 2020-02-21T08:38:12,594 (com.serotonin.m2m2.rt.dataSource.PollingDataSource.incrementUnsuccessfulPolls:150) - Data Source Application Virtual aborted 3093 polls since it started.
WARN 2020-02-21T08:45:15,170 (com.serotonin.m2m2.rt.dataSource.PollingDataSource.incrementUnsuccessfulPolls:150) - Data Source Mango Internal aborted 1494 polls since it started.
WARN 2020-02-21T08:45:15,191 (com.serotonin.m2m2.rt.dataSource.PollingDataSource.incrementUnsuccessfulPolls:150) - Data Source Internal Stats aborted 22 polls since it started.
ERROR 2020-02-21T08:45:15,249 (com.infiniteautomation.m2m2.mqtt.client.MqttClientDataSourceRT$1.connectionLost:632) - MQTT Connection Lost
org.eclipse.paho.client.mqttv3.MqttException: Connection lost
    at org.eclipse.paho.client.mqttv3.internal.CommsSender.handleRunException(CommsSender.java:153) ~[?:?]
    at org.eclipse.paho.client.mqttv3.internal.CommsSender.run(CommsSender.java:138) ~[?:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:1.8.0_212]
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[?:1.8.0_212]
    at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_212]
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_212]
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_212]
    at org.eclipse.paho.client.mqttv3.internal.wire.MqttOutputStream.flush(MqttOutputStream.java:49) ~[?:?]
    at org.eclipse.paho.client.mqttv3.internal.CommsSender.run(CommsSender.java:107) ~[?:?]
    ... 1 more
These errors occur with fairly regular timing, about every 3.5 hours. It wasn't until yesterday, when I got the Mango Internal data source working, that I saw just how regular they are.
The log entries coincide with the JVM free memory being "reset" back to 1.2 GB. The one anomaly in the middle is the backup/purge, at which point there are log entries for both the aborted polls and the lost MQTT connection.
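The 3.5-hour cadence lining up with the free-memory reset makes me suspect a full GC whose stop-the-world pause outlasts the MQTT keep-alive, so the broker drops the socket and the next write fails with the broken pipe. To check that theory I'm going to enable GC logging with the standard Java 8 flags (the log path here is just an example for our setup):

JAVAOPTS="$JAVAOPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/mango/logs/gc.log"

If the full-GC timestamps line up with the connectionLost entries, that would point at GC pauses rather than the VM's overall memory.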
I have two environments set up with a very similar load:
VM's with 2GB memory allocated, 4GB swap
JVM: ~80% memory usage, 0-15% CPU usage
JAVAOPTS="$JAVAOPTS -Xms1600m -Xmx1600m"
1850 data points
The dev environment, running v3.5.6, shows a very different free memory profile. Other than log entries at the time of the backup/purge, its MQTT connection has not been lost.
(You can see the influence of the backup/purge in this chart as well.)
I am concerned about the MQTT connection being lost, as data may be missed. This happened once during a backup, when the connection was down for 10 minutes.
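On the missed-data concern: if the client connects with a persistent (non-clean) session and subscribes at QoS 1, the broker should queue messages published while the connection is down and replay them on reconnect. I don't know what Mango's MQTT data source sets internally; this is just a minimal Paho sketch of the options I mean, with a placeholder broker URL, client ID, and topic:

import org.eclipse.paho.client.mqttv3.IMqttDeliveryToken;
import org.eclipse.paho.client.mqttv3.MqttCallback;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class DurableSessionSketch {
    public static void main(String[] args) throws MqttException {
        // Placeholder broker URL and client ID -- not the real Mango configuration.
        MqttClient client = new MqttClient("tcp://broker.example:1883", "mango-mqtt-ds");

        client.setCallback(new MqttCallback() {
            @Override
            public void connectionLost(Throwable cause) {
                System.out.println("Connection lost: " + cause);
            }
            @Override
            public void messageArrived(String topic, MqttMessage message) {
                System.out.println(topic + ": " + message);
            }
            @Override
            public void deliveryComplete(IMqttDeliveryToken token) {
                // Not used for a subscribe-only client.
            }
        });

        MqttConnectOptions opts = new MqttConnectOptions();
        opts.setCleanSession(false);      // durable session: broker keeps subscription state across disconnects
        opts.setKeepAliveInterval(30);    // seconds; a pause much longer than this trips the keep-alive
        opts.setAutomaticReconnect(true); // recent Paho releases reconnect with backoff after connectionLost

        client.connect(opts);
        // QoS 1: the broker queues matching messages for this client while it is disconnected.
        client.subscribe("sensors/#", 1);
    }
}

Whether this helps also depends on the publisher using QoS 1 and the broker honouring persistent sessions; QoS 0 messages sent during the outage are gone either way.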
Is this issue due to the limited resources (memory) on the VM? I am not convinced it is, as the dev environment does not show similar behaviour. Or is the issue due to v3.6.5 (compared to v3.5.6)?
Is this behaviour expected during a backup/purge? Can it be mitigated?
Any ideas would be great.
Thanks
Ian
PS: As another comparison, I have a similar app running on a physical box with 32 GB of memory, a JVM heap of 2.8 GB, and 2300 data points, on v3.5.6, and I have not seen this behaviour (MQTT connection lost) there.
-
Heya:
I ran into similar issues with memory usage-- Notably, whenever one of our users ran one of their Big Excel Reports, the Available Memory would go to Zero, and the wheels would come off of Mango. We set our JAVAOPTS heap to 10G, and those problems seem to have gone away.
Mango uses Monster Amounts of Memory under certain conditions-- I'd say that your 1600 MB is too small. I've got site machines running with 2 gigs of RAM, and we keep the JAVAOPTS heap small for those. Those sites only use the Modbus data source, a handful of meta points, and the persistent publisher. Even that seems to use all the available memory at times, and once Java takes RAM it never gives it back.
Right now, our main machine is set up with 16 gigs of RAM, with only 174 megs free. The rest of that is Java, which is using 14.785 GB of RAM according to top.
Usage peaks whenever a report is run, but Java never gives the RAM back to the OS after it's taken. Right now our JVM free memory is about 5 gigs, but that will drop below 2 gigs if one of these big reports gets run.
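From what I understand that's the default collector behaviour on Java 8: the heap grows toward -Xmx and, unless the free-ratio settings let it shrink, it never gets handed back to the OS. If you want to experiment, something like this is supposed to let the heap shrink back after a spike (behaviour varies by collector and JDK version, and the numbers here are just examples, so test it first):

JAVAOPTS="$JAVAOPTS -Xms2g -Xmx10g -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40"

We haven't tried that on our main machine, so no promises.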
Cheers,
-Greg