Serial port destroy hangs the application
-
Hi,
I'm currently developing an application using your library to monitor all the energy meters in our building in real time. Although I have to say the library is quite well made and architected, I'm facing a weird problem with it right now.
My program runs OK and polls about 50 meters, but sometimes it just hangs/freezes right when I call the method master.destroy.
You can see the log from my program here:
2010-02-03 11:29:21,861 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Frequency :50048
2010-02-03 11:29:22,049 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - ####################################
2010-02-03 11:29:22,096 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - ############################## ID 81
2010-02-03 11:29:22,330 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Energy :409937
2010-02-03 11:29:24,143 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Voltage Phase A :23145
2010-02-03 11:29:24,408 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Voltage Phase B :22818
2010-02-03 11:29:24,674 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Voltage Phase C :22868
2010-02-03 11:29:24,939 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Current Phase A :43608
2010-02-03 11:29:25,205 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Current Phase B :41427
2010-02-03 11:29:25,455 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Current Phase C :41558
2010-02-03 11:29:25,768 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Phase A :611
2010-02-03 11:29:26,080 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Phase B :583
2010-02-03 11:29:26,346 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Phase C :605
2010-02-03 11:29:26,596 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Factor Phase A :6200
2010-02-03 11:29:26,861 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Factor Phase B :6100
2010-02-03 11:29:27,158 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Power Factor Phase C :6300
2010-02-03 11:29:27,424 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Frequency :50097
2010-02-03 11:29:27,611 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - ####################################
2010-02-03 11:29:27,611 INFO [main] (s3.meters_poller.meters.Acrel3000eMeter) - Destroing the master and the connection
And here is the code that calls the destroy function:
for (int id : metersID) {
    Unit currUnit = dbp.getOrCreate(ipv6 + ":C" + id);
    acrel3000eMeterLogger.info("############################## ID " + id);
    this.readEnergy(id, master, currUnit, dbp);
    this.readVoltageA(id, master, currUnit, dbp);
    this.readVoltageB(id, master, currUnit, dbp);
    this.readVoltageC(id, master, currUnit, dbp);
    this.readCurrentA(id, master, currUnit, dbp);
    this.readCurrentB(id, master, currUnit, dbp);
    this.readCurrentC(id, master, currUnit, dbp);
    this.readPowerA(id, master, currUnit, dbp);
    this.readPowerB(id, master, currUnit, dbp);
    this.readPowerC(id, master, currUnit, dbp);
    this.readPowerFactorA(id, master, currUnit, dbp);
    this.readPowerFactorB(id, master, currUnit, dbp);
    this.readPowerFactorC(id, master, currUnit, dbp);
    this.readFrequency(id, master, currUnit, dbp);
    dbp.saveReadings(currUnit);
    acrel3000eMeterLogger.info("####################################");
}
acrel3000eMeterLogger.info("Destroing the master and the connection");
master.destroy();
acrel3000eMeterLogger.info("DONE");
Further investigation of this behavior led me to believe it's an RXTX problem, as described here (http://forums.sun.com/thread.jspa?threadID=5261638), here (http://mailman.qbang.org/pipermail/rxtx/20051229/002014.html) and here (http://bugzilla.qbang.org/show_bug.cgi?id=46).
I've been able to reproduce the problem on Windows and not on Linux, but since there are people with the same RXTX problem on Linux, and your implementation should be the same for both OSes, I'm guessing I've only been lucky. So, can you shed some light on my problem? I really can't afford for my program to stop, it has to be always running.. :roll:
Sorry for the long post,
Miguel.
-
Hi Miguel,
This is the first I've heard of the problem. Have you tried any of the solutions that are suggested in the posts that you referenced? If you have something that works for you, let me know.
-
Another thing that comes to mind... Is it absolutely necessary for you to close the port? Maybe you can open it at the start of the application, and only close it when the app shuts down?
-
Since I'm reading different meters, I need to open the serial port with different communication parameters a few times in each cycle, so yeah, I should close it.
Regarding solutions.. The suggested solutions are meant to be applied in the code that handles RXTX (your serotonin package, if I remember correctly), so I can't really try them since I don't have access to the code.
So, so far, no solution found.. The only thing I found out is that on Linux the error appears to be much more difficult to reproduce.
If you could have a look at the suggestions in those threads and at the way your package closes the serial port, it would be great (I can also do it if I have access to the code, no problem with that).
-
There's not much to report in that regard. Within seroUtils this is the code used to close the port:
public static void close(SerialPort serialPort) {
    if (serialPort != null)
        serialPort.close();
}
None of the code uses the event listeners, so that route is likely a bust. I haven't much time to try the other solutions, and besides, as I mentioned, I've never seen this error myself. You should be able to work with the Modbus4J code and try their suggestions.
-
So, I've narrowed it down to what I think is the most probable cause. Unfortunately I can't really try the suggestions for myself, as you said, because I don't have access to the seroUtils code (have I missed something?).. otherwise I would have moved on with my life happily.
The issue in question:
I had similar problems in my application when using rxtx 2.1.7. The close method would hang forever even when no other thread was reading or writing to the serial port. After reading some rxtx code I traced my problem to the way io locking is done in the rxtx library. Whenever a read or write is called on an RXTXPort then a field called IOLocked will be incremented and when the operation is finished this field will be decremented. However this incrementation and decrementation is not properly synchronized so when running code where one thread reads and another one writes I got the IOLocked field in a state where the value was != 0. The problem with this is that the close method will wait forever while the IOLocked field has a value != 0.
The workaround for this is a rather nasty hack. Since the IOLocked field has default access then you can modify it from outside the class. So I reset the IOLocked field before the close method is called.
package gnu.io;

public final class RXTXHack {
    private RXTXHack() {
    }

    public static void closeRxtxPort(RXTXPort port) {
        port.IOLocked = 0;
        port.close();
    }
}
or:
Hello,
There might be a bullet point to add to the RXTX homepage in relation to the
'deadlock' problem that occurs under the following conditions:
- Closing a port from an event listener results in a deadlock.
- Closing a port without adding an event listener results in a deadlock.
namely:
- Closing a port on which there is an open InputStream that is
being repeatedly read results in a deadlock.
or a similar problem:
I spent quite some time on this bug but it seems very hard to solve: I created
a solution along the lines described in comment #2, but when rxtx hangs in a
write operation it does never come back. It essentially blocks the thread that
does the write for ever AND it prevents a close on the RXTX port:
- RXTXPort.SerialOutputStream.write increases the RXTXPort.IOLocked variable
- it then calls the native writeByte or writeArray method (which blocks)
closing the RXTXPort hangs for ever too:
- RXTXPort.close() has a
while( IOLocked > 0 ) {..Thread.sleep(500);...}
which spins for ever
==> once the communication hangs, it cannot recover.
Unless there is a communication timeout in RXTX, I don't see a solution to this
problem.
Well, I can make it not blocking the UI, but then the RXTXPort gets stuck. That
is probably better than blocking the UI, because the user can at least restart
eclipse.
So the suggested solution is resetting the IOLock to 0 before calling the RXTX close method, since this bug has yet to be solved (http://bugzilla.qbang.org/show_bug.cgi?id=46)..
Could you package a version of the seroUtils code that resets the IOLock? I'm not asking you to see if the workaround works as this is very difficult to reproduce, i'll do that on the machine that suffers this problem and report here.
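As a complement to resetting the counter, the "make it not block" idea from the last quoted report could be sketched roughly like this: run the close call on a helper thread and stop waiting after a timeout. This is only a sketch; the class name BoundedClose, the method closeWithTimeout and the timeout value are made up for illustration, and nothing here is part of RXTX or Modbus4J. As the quoted report says, the port itself may still be stuck afterwards; the caller just gets control back.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of RXTX or Modbus4J.
public class BoundedClose {

    /**
     * Runs closeAction on a daemon thread and waits at most timeoutMillis.
     * Returns true if the action finished in time, false otherwise.
     */
    public static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
        // Daemon thread so a close that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "port-closer");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up waiting; interrupt the helper thread in case it is sleeping.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The close action itself threw an exception.
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In the polling loop above it would be used as something like BoundedClose.closeWithTimeout(() -> master.destroy(), 5000), logging a warning when it returns false.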
Best Regards,
Miguel.
-
The fix attempt doesn't need to be done in seroUtils, since Modbus4J explicitly references RXTX. It could just as easily be done directly in the Modbus4J code.
Regardless, I've added the RXTXHack to seroUtils.jar, and will email it to you.
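For anyone reading along, the race described in the quoted post can be illustrated with a small stand-alone sketch. It does not use RXTX at all: unsafeCounter is a hypothetical stand-in for a plain int that is incremented and decremented without synchronization, and safeCounter shows the same bookkeeping done with an AtomicInteger. All names here are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-alone illustration; none of these names exist in RXTX.
public class IoLockCounterSketch {

    // Stand-in for an unsynchronized "operations in progress" counter.
    public static int unsafeCounter = 0;

    // Same bookkeeping, but with atomic updates.
    public static final AtomicInteger safeCounter = new AtomicInteger();

    // Simulates many read/write calls that each mark themselves as in progress.
    public static void simulateIo(int operations) {
        for (int i = 0; i < operations; i++) {
            unsafeCounter++;               // read-modify-write, not atomic
            unsafeCounter--;
            safeCounter.incrementAndGet(); // atomic
            safeCounter.decrementAndGet();
        }
    }

    // Runs a "reader" and a "writer" thread concurrently and waits for both.
    public static void runTwoThreads(int operationsPerThread) {
        Thread reader = new Thread(() -> simulateIo(operationsPerThread), "reader");
        Thread writer = new Thread(() -> simulateIo(operationsPerThread), "writer");
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

After runTwoThreads finishes, safeCounter is always back at 0, while unsafeCounter may or may not be, depending on timing. A close method that waits for such a counter to reach 0 would never return in the unlucky case, which is the situation the hack papers over.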
-
So far, so good.. the fix seems to work, as the application has never run autonomously for this long.
since Modbus4J explicitly references RXTX
In which class? I remember following the code flow and seeing it call the close method of a seroUtils object.
Anyway, we've found out how to solve the problem. I'm not sure if this should be included in your future versions, since this is a dirty hack to solve a problem that doesn't occur frequently (only when doing a lot of reads on a loaded machine, preferably using Windows :oops: ). It should be documented as a known issue though, to save other people a lot of time.
Thanks again for your help.
-
No problem. You did most of the legwork, so thanks for the fix.
-
Hi, I'm developing a sensor application pretty similar to what Miguel runs. I'm stuck on closing the RxTx port. Everything works fine, but I just can't close it. And what's interesting, it doesn't hang - it just keeps running.
And there is a thing which confuses me the most - I can't find IOLocked!
import java.io.*;
import gnu.io.*;
import java.util.*;

public class SerialPortHandler {
    private SerialPort serialPort;
...
my serialPort just doesn't have this parameter.
And another problem which I would be so happy to resolve - it takes soooo long to initialize the port - maybe 40 seconds or so.
Miguel, how long does your port take to initialize? Did you have that problem at all?
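On the missing IOLocked: going by the RXTXHack class quoted earlier in the thread, the field is a default-access member of gnu.io.RXTXPort, so it would not be visible through a variable of type SerialPort, nor from code outside the gnu.io package. One possible way to reach it without adding a class to gnu.io is reflection. The sketch below uses stand-in classes (FakeSerialPort, FakeRxtxPort) so that it runs without RXTX installed; only the field name comes from this thread, and whether this works against a particular RXTX version is an assumption to verify.

```java
import java.lang.reflect.Field;

// Stand-alone sketch; the Fake* classes only imitate the shape described in the thread.
public class IoLockedReflectionSketch {

    // Stand-in for the abstract port type, which does not declare IOLocked.
    public abstract static class FakeSerialPort {
        public abstract void close();
    }

    // Stand-in for the concrete port class that declares the field.
    public static final class FakeRxtxPort extends FakeSerialPort {
        int IOLocked = 3;          // pretend three operations never finished
        public boolean closed = false;

        @Override
        public void close() {
            // The real close is described as waiting while IOLocked > 0;
            // this stand-in simply refuses to close instead of blocking.
            if (IOLocked == 0) {
                closed = true;
            }
        }
    }

    // Looks up a field named "IOLocked" on the runtime class and sets it to 0.
    // Returns false if the field does not exist or cannot be written.
    public static boolean resetIoLocked(Object port) {
        try {
            Field field = port.getClass().getDeclaredField("IOLocked");
            field.setAccessible(true);
            field.setInt(port, 0);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }
}
```

With a real port, the call would be resetIoLocked(serialPort) followed by serialPort.close(); a false return value means the field was not found on that object's class.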