Modbus TCP no recipient found for response

jvaughters

You will have to isolate the inverter from Mango to know for sure. The first thing I would do is leave the configuration as it is, run a constant ping script, and log the results. See if the ping response times grow during the failure times in Mango. If you are getting ping failures or long responses at the same times as the Mango failures, it is likely the inverter software lagging or network latency. The next, more drastic isolation step would be to use another Modbus client, running the ping test at the same time, and see if you get errors there as well, again logging the errors. I've used the original Mango, ScadaBR, and am now testing the new M2M2 Mango, over nearly 10 years, and most of the time it is not Mango but the device or network that is the issue.
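A minimal sketch of such a ping logger, in case it helps. It assumes Python 3 and the Linux ping flags (-c for count, -W for timeout in seconds); the inverter address and log file name in the usage comment are hypothetical. The timestamp format copies the Mango I/O log so the two files can be lined up side by side.

```python
import re
import subprocess
import time
from datetime import datetime

# Matches the "time=12.3 ms" field printed by common ping utilities.
_RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtt_ms(ping_output):
    """Return the round-trip time in ms found in ping output, or None."""
    match = _RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def ping_once(host, timeout_s=1):
    """Send one ICMP echo request. Flags assume the Linux ping utility."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            capture_output=True, text=True, timeout=timeout_s + 2,
        )
    except subprocess.TimeoutExpired:
        return None
    return parse_rtt_ms(result.stdout) if result.returncode == 0 else None


def log_pings(host, logfile, interval_s=1.0, count=None):
    """Append one timestamped line per ping; count=None runs until interrupted."""
    sent = 0
    with open(logfile, "a") as log:
        while count is None or sent < count:
            rtt = ping_once(host)
            stamp = datetime.now().strftime("%Y/%m/%d-%H:%M:%S,%f")[:-3]
            status = "FAIL" if rtt is None else "%.1f ms" % rtt
            log.write("%s %s\n" % (stamp, status))
            log.flush()
            sent += 1
            time.sleep(interval_s)


# Usage (hypothetical address and file name):
# log_pings("192.168.1.50", "inverter_ping.log")
```

ICMP is used here rather than a TCP connect to the Modbus port, so the test does not take up a Modbus connection on the inverter.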

phildunlap

Hi rob987,

Do you have "Contiguous batches only" checked? It would appear to me you probably do, and that part of the problem may be segmenting the requests for those points into several individual requests, as can be seen in the I/O sample you shared. That's why there's more than one request per second but they're all in the holding registers and only somewhat separated. I suspect disabling that, or increasing the max registers per request, could allow a more efficient polling of the device.

You can see in the timing there that Mango is quite ready to send the next request, but it's waiting for the device to reply.

As for the "no recipient" message: that would be caused by very close timing, where the response is received just after the timeout has expired. That may explain why it seems to occur at somewhat random times.

jvaughters

Interesting view @phildunlap. One of these days I will have to Wireshark Mango polls and play with those settings to see how Mango tries to optimize. But even if that is checked or not, if the inverter has a good network connection and the software is responsive, it should be able to respond quickly. It is very few points. I'm glad you pointed that out though, something for me to check out one day and it is certainly worth a try on this case.

@rob987 - Is this a wired or wireless connection?

Also, I have found that small resource embedded devices can get clogged up with actions that can slow the response. Can you correlate any inverter actions with the failures?

phildunlap

> Interesting view @phildunlap. One of these days I will have to Wireshark Mango polls and play with those settings to see how Mango tries to optimize.

Behold! https://github.com/infiniteautomation/modbus4j/blob/master/src/com/serotonin/modbus4j/BatchRead.java#L180

> But even if that is checked or not, if the inverter has a good network connection and the software is responsive, it should be able to respond quickly.

We don't need to speculate. From the I/O log we can see it's taking 90 - 250 ms (and more) to respond to each request.

jvaughters

Awesome, thanks for the code reference. I am just now realizing that log is the Modbus request and responses. I was looking at that log wondering what it was, then it hit me it's Modbus req/rep. How do you get Mango to log that I/O? That could be useful.

jvaughters

So in this case, why did it send out the exact same request in less than a ms? Other times I see it in 1 ms. But why the exact same request in such a short time?

2019/03/22-21:06:43,356 O 000c00000006010300e20002
2019/03/22-21:06:44,356 O 000c00000006010300e20002

phildunlap

Ah, you found the option before I could link and proclaim "Behold!" again. But...

Behold!

2019/03/22-21:06:43,356 O 000c00000006010300e20002

2019/03/22-21:06:44,356 O 000c00000006010300e20002

I see a one second gap. Might that be your timeout?

jvaughters

Nope, behold was needed, I was just using the log above, it is 1 sec gaps, so that is just a retry and makes sense. I missed the increment. So the question is still why is it needing to retry? Something is holding it up. Network or processor?

Thx for the beholds too `,~)
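For anyone reading these O lines later, here is a small sketch that decodes the request frame from the log into its Modbus TCP fields. It assumes Python 3 and a 12-byte read request (MBAP header plus function code, start address, quantity); the class and function names are just for illustration.

```python
import struct
from typing import NamedTuple


class ReadRequest(NamedTuple):
    transaction_id: int
    protocol_id: int
    length: int
    unit_id: int
    function_code: int
    start_address: int
    quantity: int


def decode_read_request(hex_frame):
    """Decode a Modbus TCP read request given as hex text."""
    raw = bytes.fromhex(hex_frame)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte Modbus TCP read request")
    # MBAP header: transaction id, protocol id, length (16-bit big-endian), unit id.
    # PDU: function code, starting address, quantity of registers.
    return ReadRequest(*struct.unpack(">HHHBBHH", raw))


request = decode_read_request("000c00000006010300e20002")
# transaction_id=12, unit_id=1, function_code=3 (read holding registers),
# start_address=226, quantity=2
```

Both logged frames are byte-for-byte identical, transaction id included, which fits the reading that the second one is a retry of the first.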

phildunlap

> Thx for the beholds too `,~)

Can't hold them in, glad they're appreciated!

I would wager it's the device's ability to respond. It's not impossible it's been taught to briefly ignore connections from a particular host if it thinks it's getting flooded, or even just to enforce a max requests per interval from anyone. It could also be that it's pausing for something like a garbage collection, if it were a Java application. No way of knowing, really. It's probably unlikely to be the network, since doing Modbus over the internet is usually not ideal (no validation, but a VPN is fine for low security), so it's probably very small packets over LAN, but it's certainly possible it's some wireless card failing to receive or send something. Not knowable offhand.

rob987

Hi,
Thanks for all the input. 247 errors yesterday (27/03/2019) on a cloudy rainy day, so the inverter does not rest :-),
and devote more time to comms.

The inverter is on a wired ethernet connection, with mango and the inverter on the wired LAN.

I had "Contiguous batches only" ticked, as the inverter would not answer back when I first added it. I have now unticked it and reduced the max read registers to 50. This is giving me 2 requests.

2019/03/28-08:33:28,568 O 00230000000601030053000d
2019/03/28-08:33:28,833 I 00230000001d01031a22e2ffff1384fffe2494ffff91d8fffedac6fffe006aeaf70000
2019/03/28-08:33:28,834 O 002400000006010300ce0025
2019/03/28-08:33:28,914 I 00240000004d01034a033b012300f7011f00000388012c011d01440000fe93ffb7ff73ff6a0000de17db9ee036de71fffe004162a600191f070015cec0001e39760044aaeb002c0006001e72c40005fcb80000

What is the ,nnn after the date-time in the above log (2019/03/28-08:33:28,568)? Is it a ms timestamp of Java sending and receiving?

Is there any way to have a poll period associated with a point? The 2nd, longer read is for kWh values, and could be read every 15 minutes. This might allow the instant power values to be read more frequently.

I am using Mango to read and publish the values to a Click PLC, as I cannot get the Click to talk to the inverter. I used Wireshark on the line, and I see the PLC send 3 SYN requests, and the inverter does not reply. I suspect it may be "too fast" for the inverter.

No errors for 30 minutes, so maybe solved, but not fixed !!!!!!
Thanks

jvaughters

The nnn is ms
The only way I know how to have separate point scan times is to create multiple data sources with different scan times. I will let Mango staff have the last word on that. This could be a problem if the inverter does not allow more than one TCP/IP session.
That "Contiguous Batch Only" definitely changed the request to ask for larger blocks of data. This may be more friendly to the processor. Meaning less requests to handle.
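With the ,nnn field read as milliseconds, the device's response time per request can be pulled straight out of the I/O log. A rough sketch, assuming Python 3 and the log line layout shown above (timestamp, O or I, hex frame); the function name is made up for illustration:

```python
from datetime import datetime, timedelta

LOG = """\
2019/03/28-08:33:28,568 O 00230000000601030053000d
2019/03/28-08:33:28,833 I 00230000001d01031a22e2ffff1384fffe2494ffff91d8fffedac6fffe006aeaf70000
2019/03/28-08:33:28,834 O 002400000006010300ce0025
2019/03/28-08:33:28,914 I 00240000004d01034a033b012300f7011f00000388012c011d01440000fe93ffb7ff73ff6a0000de17db9ee036de71fffe004162a600191f070015cec0001e39760044aaeb002c0006001e72c40005fcb80000
"""


def response_times_ms(log_text):
    """Map each transaction id to the ms between its O frame and its I frame."""
    pending = {}
    latencies = {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, direction, frame = line.split()
        when = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S,%f")
        tid = int(frame[:4], 16)  # first two bytes are the transaction id
        if direction == "O":
            # A retry with the same id overwrites the earlier send time.
            pending[tid] = when
        elif direction == "I" and tid in pending:
            delta = when - pending.pop(tid)
            latencies[tid] = delta // timedelta(milliseconds=1)
    return latencies


print(response_times_ms(LOG))  # {35: 265, 36: 80}
```

So in this sample the 13-register read took 265 ms to answer and the 37-register read took 80 ms.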

Recommendations to try:

1. Try to set the Transport Type to TCP with keep-alive if it is not already. I think that should be the default. I have run into issues where performance is affected if that is not set. The constant TCP session set up and tear down is wasted load time.
2. If the inverter allows more than one TCP/IP session then create a second data source that polls at the 15 min poll rate. If there is another way to do this, maybe the Mango folks will chime in. All settings should be the same for each data source except the points and poll rate.
3. It's highly unusual, but it is possible that the PLC and the inverter have differing TCP stack software. This can lead to devices not talking. I have seen this before in embedded devices that are old. If this is true you could break down the TCP sections from Mango and compare them to the PLC TCP packets. It's not likely, but possible. You could also check every bit sent from Mango and compare to every bit sent from the PLC and see how they differ. It can be grueling and may not be worth the time, but if you really want to know this is an option.

phildunlap

> The only way I know how to have separate point scan times is to create multiple data sources with different scan times. I will let Mango staff have the last word on that. This could be a problem if the inverter does not allow more than one TCP/IP session.

:-/ "let"

That's one way. The other way would be a script calling RuntimeManager.refreshDataPoint("DP_XID") (not applicable to that question, but there's also a RuntimeManager.refreshDataSource("DS_XID")). One could also PUT the REST endpoint /rest/v1/runtime-manager/force-refresh/{xid} or pay someone to mash the refresh swish on the data point details page.

You'd need a logging type other than interval to see this difference in polling density over the longer term.
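As a rough illustration of the REST route from an external script: the sketch below only builds the PUT request for the endpoint named above. The base URL, the XID, and the bearer-token Authorization header are assumptions for the example, so check how your Mango instance expects REST clients to authenticate.

```python
import urllib.parse
import urllib.request


def build_force_refresh_request(base_url, xid, token):
    """Build (but do not send) a PUT to the force-refresh endpoint for one XID."""
    path = "/rest/v1/runtime-manager/force-refresh/" + urllib.parse.quote(xid, safe="")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="PUT",
        # Assumption: token-based auth; adjust to your instance's scheme.
        headers={"Authorization": "Bearer " + token},
    )


# Hypothetical host, XID and token:
req = build_force_refresh_request("http://localhost:8080", "DP_XID", "TOKEN")
# To actually send it: urllib.request.urlopen(req)
```

Calling this from a timer gives the faster refresh for selected points while the data source itself polls at the slower rate.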

> That "Contiguous Batch Only" definitely changed the request to ask for larger blocks of data. This may be more friendly to the processor. Meaning less requests to handle.

Not having that setting ticked generally leads to larger requests. While it is possible to get larger requests with that setting, it requires a contrived register mapping and maximum registers. Contiguous batches only means that only registers for which there are data points configured will have their values requested. Non-contiguous means it'll try to read the registers in between as well, if that's possible without exceeding the max read registers or bits, and the function I linked you to takes a greedy approach to constructing the read groups.
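To make the difference concrete, here is a toy sketch of greedy grouping. It is not the modbus4j implementation linked above, just an illustration of the idea under simplified assumptions: one register per point, hypothetical register offsets, and a made-up function name.

```python
def group_reads(registers, max_registers, contiguous_only):
    """Greedily group register offsets into (start, count) read requests."""
    groups = []
    for offset in sorted(set(registers)):
        if groups:
            start, count = groups[-1]
            adjacent = offset == start + count          # directly follows the group
            fits = offset - start + 1 <= max_registers  # span stays within the limit
            if fits and (adjacent or not contiguous_only):
                # Extend the current group, covering any gap registers too.
                groups[-1] = (start, offset - start + 1)
                continue
        groups.append((offset, 1))
    return groups


points = [100, 101, 104, 105, 120, 121]  # hypothetical register offsets

print(group_reads(points, 50, contiguous_only=True))
# [(100, 2), (104, 2), (120, 2)]
print(group_reads(points, 50, contiguous_only=False))
# [(100, 22)]
```

With the setting on, each run of adjacent registers becomes its own request. With it off, the gaps are read as well, so the same points fit in one larger request as long as the span stays under the max.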

jvaughters

@phildunlap

Correct, my assumption was that he unchecked "Contiguous Batch Only" and it resulted in two requests for 13 and 37 registers, as opposed to 2 registers per request, based on the Modbus I/O logging, a feature I dearly love. I'm totally lost on the alternative point polling periods you proposed. And that is ok, because I do not need that feature in Mango and will be sure to contact you if I ever do.

From a general SCADA perspective, multiple polling rates for different points are handled differently across different systems. Some take over completely and optimize as they see fit, and some do that VERY well. Others, and this is my favorite approach, let you create scan groups where you specify the points you want per group and the scan rate you would like for that group. Then the same TCP/IP session can be used by the Modbus master to scan the groups as specified. Not sure if Mango would ever consider the scan group idea, but I would certainly like that feature.
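A bare-bones sketch of what I mean by scan groups, purely illustrative and not a Mango feature: each group has its own period, and one poller decides on every tick which groups are due. The group and point names are hypothetical.

```python
# Each group: (scan period in seconds, points read together in that scan).
SCAN_GROUPS = {
    "instant_power": (1, ["ac_power", "dc_power"]),
    "energy_totals": (900, ["kwh_today", "kwh_total"]),
}


def due_groups(scan_groups, elapsed_s):
    """Return the names of the groups whose period divides the elapsed seconds."""
    return [
        name
        for name, (period_s, _points) in scan_groups.items()
        if elapsed_s % period_s == 0
    ]


print(due_groups(SCAN_GROUPS, 1))    # ['instant_power']
print(due_groups(SCAN_GROUPS, 900))  # ['instant_power', 'energy_totals']
```

All groups would share the one TCP/IP session, which matters for devices that only accept a single connection.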

jvaughters

@phildunlap

Ah! I do get what you are saying about scripting the update. So you could create a timer and use that to run a script to refresh. Your data source could be set to the longest period you wanted for a refresh, and you could create a script to refresh at shorter intervals. That actually is a nice workaround and, oddly, it mimics the scan group feature I discussed. It does lead to a question.

Will the scripted refresh be a single point only, or could you pass it a group of points?
Would that bypass the Modbus attempt to optimize requests?

phildunlap

> Will the scripted refresh be a single point only, or could you pass it a group of points?
> Would that bypass the Modbus attempt to optimize requests?

It would be single requests for points, good observation! One could construct BatchRead objects manually if there were a need to do so, like this person is doing (though I don't think they're using Mango): https://forum.infiniteautomation.com/topic/2574/how-to-cast-batchresults-getvalue-n-property-to-int-in-modbus-tcp-ip. Alternatively, splitting the points across different data sources would let you refresh a whole data source to poll those points together in the same request. Separating them into different data sources and just using the polling rate would probably be the best easy tactic, but if an event should trigger a temporarily faster polling time I could certainly see scripting that, like we've discussed here or as in this thread where the data source's poll period is adjusted in an event handler: https://forum.infiniteautomation.com/topic/3288/alter-data-source-update-time-via-event-handler

jvaughters

However, the refresh work around may be the only way to do it for a device that only allows a single TCP/IP session, so that is a good technique to remember and keep in my tool box. Plus, single point MODBUS requests are not the most efficient, but in reality the load increase is very small in most cases. I like it, thx.

rob987

Hi,
Just reporting that no errors in the last 24 hours,
I have the TCP with keep-alive set. Comms was very unreliable with a new connection for each request.
Now that the forest of errors has disappeared, I notice I get a "Connect timeout" twice a day, about an hour after sunset and an hour before sunrise. I suspect the inverter reboots twice a day as a quick fix for a leaky system.

The inverter only accepts one modbus connection at a time, so a second device is not an option.

The instant power value swings very quickly as clouds move across the sky, and the fastest update time I can get allows the PLC to adjust the load on 2 x hot water system elements via SSRs. The kWh value is only really used to display "daily usage" of 15 - 30 kWh. It could be read every 15 - 30 minutes.

I have a support request open with SolarEdge, and have sent them the Wireshark logs of a working modpoll connection from a PC, and a PLC connection showing only 3 x SYN requests. I will send them my findings from this, but I will ask for a "stress test" on the next inverter I purchase.

Thanks for all your help

phildunlap

Hi rob987,

> Thanks for all your help

Thank you for the very thorough first post!

> The kWh value is only really used to display "daily usage" of 15 - 30 kWh. It could be read every 15 - 30 minutes.

Another possibility I forgot to mention would be using a scripting data source to enable, wait, and then disable the points that don't need the faster poll rate, and then speed up the polling on the data source normally.