• Recent
    • Tags
    • Popular
    • Register
    • Login

    Please Note This forum exists for community support for the Mango product family and the Radix IoT Platform. Although Radix IoT employees participate in this forum from time to time, there is no guarantee of a response to anything posted here, nor can Radix IoT, LLC guarantee the accuracy of any information expressed or conveyed. Specific project questions from customers with active support contracts are asked to send requests to support@radixiot.com.

    Radix IoT Website Mango 3 Documentation Website Mango 4 Documentation Website Mango 5 Documentation Website

    BACnet/IP points reading value timeout issue when configured to periodic update only in the data source

    Scheduled Pinned Locked Moved User help
    16 Posts 2 Posters 3.3k Views 2 Watching
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • phildunlapP Offline
      phildunlap
      last edited by

      Hi cwangv,

      I would definitely not have that many medium and low priority threads, unless you're running on a massive server-class machine. Even then, 300 medium priority and 100 low priority threads is excessive. On machines with 40 cores I usually only go 30/10 on those settings, to provide some context. But, I don't think that's the issue with the COV points.

      One thing that i have noticed is that the 'successful Poll%' always at 100% even while these online/offline issues happening.

      The successful poll percentage refers to aborting polling for usually a timing reason. For instance, if a modbus data source was polling every 2 seconds but it always took 3.5 seconds to poll, it would abort 50% of its polls and show a 50% successful poll rate.

      In order to narrow down the issue, i decided to put all the points back to COV enabled then proceeded to disable the COV setting on a number of them at a time and see when crazy poll duration happened.

      Is this BACnet IP or MSTP ? It would be difficult to diagnose the cause of it slowing down that much - that would probably take a wireshark to see which half of the conversation was pausing. To me, this doesn't sound like it would affect the COV issue either.

      In some previous versions there was an issue with having two BACnet data sources sharing the same local device and doing COV subscriptions. If you can confirm your BACnet module version that could be helpful.

      Can you check your ma.log files from around the time COV subscription stopped working for a message like Subscription failed due to error response for instance at the WARN level?

      cwangvC 1 Reply Last reply Reply Quote 0
      • cwangvC Offline
        cwangv @phildunlap
        last edited by cwangv

        @phildunlap
        Phil
        It is a Bacnet/IP data source. Bacnet Module is 3.6.1. I have just recreated the comms problem (no COV) and have done Wireshark capture.

        Can you let me know how I should submit the wireshark log file?

        Thanks.

        1 Reply Last reply Reply Quote 0
        • phildunlapP Offline
          phildunlap
          last edited by

          You can email it to support@infiniteautomation.com

          cwangvC 2 Replies Last reply Reply Quote 0
          • cwangvC Offline
            cwangv @phildunlap
            last edited by

            @phildunlap
            Hi, Phil
            I have just sent an email to the address you provided. It has got links to two Wireshark captures.
            If you need more info, please let me know.

            thanks.

            1 Reply Last reply Reply Quote 0
            • cwangvC Offline
              cwangv @phildunlap
              last edited by

              @phildunlap
              Hi, Phil
              I am just wondering if the Wireshark captures have been helpful to you guys.
              I can reliably re-produce the issue whenever i set all the points to 'COV' Disabled.
              So you need more capture data or log files from my Mango installation. Please don't hesitate to reach out to me.

              Thanks.

              1 Reply Last reply Reply Quote 0
              • phildunlapP Offline
                phildunlap
                last edited by

                Hi cwangv,

                Thanks for sending in the captures. I did take a look at them and raise the issue, but did not yet resolve the matter. Can you fully describe the replication steps you've found?

                cwangvC 1 Reply Last reply Reply Quote 0
                • cwangvC Offline
                  cwangv @phildunlap
                  last edited by cwangv

                  @phildunlap
                  Hi, Phil
                  A summary of the software configuration:
                  1-Mango Automation Software: Core 3.6.4, Bacnet 3.6.1, Windows 10 Pro 64 Bit.
                  2-Data Source: BACnet IP device with 198 AV points defined, all initially set to 'COV' Enabled and logging to 'All' with polling period of 5 minutes in the data source.

                  Steps to generate the issues:
                  -by setting all 198 AV points to 'COV' Disabled (unchecked), all 198 AV points will go unreliable and lots of 'RTN' events generating for each point. It also appears Mango Automation software will start polling the points in 10 seconds interval as soon as errors occurred.
                  -Sometimes, this issue can be generated if less than a dozen points are left as 'COV' Enabled (checked). Below is a typical event message when Mango is stuck in such state:
                  Note: this happens even when moments ago the polling was working fine for a long period of time. So i dont think it is the physical device problem.

                  0_1568850808192_44f87c8c-76ad-4536-a25e-03c5d593f28c-image.png

                  I have observed previously that the 'COV' functions complete stopped for all the points,ie, no updating even though the actual data source points are changing.

                  I hope the above description is helpful.

                  1 Reply Last reply Reply Quote 0
                  • phildunlapP Offline
                    phildunlap
                    last edited by

                    Okay, it sounds like there are two questions to be investigated, then.

                    First, that the points stop receiving COV notifications after some amount of time working without issue. You captured one of these events in the wiresharks you sent in, yes?

                    The second question concerns why the poll time would just to 48000-ish milliseconds. That number would suggest to me that some of the requests are timing out, and that to poll all those points the requests are being split up. Your Local Device has a timeout of 6000 and a retries of 3, so 48000 would be 8 timeouts, which may or may not all be in retrying one request. If they all did occur in requesting a single set of points, that would produce the event you have a picture of there. It could be that the device is more easily overwhelmed by the large read requests, or perhaps it is generating some kind of log messages about why it isn't responding to certain read requests.

                    cwangvC 1 Reply Last reply Reply Quote 0
                    • cwangvC Offline
                      cwangv @phildunlap
                      last edited by

                      @phildunlap
                      Yes, one of the log files contained the traffic after the COV update stopped between the data source and Mango.
                      One thing to note - that while all these abnormal events happening inside the Mango instance we have, i have another device (Not Mango) polling the same data source via BACnet/IP for the same amount point with no issue at all.

                      Let me know if you need more wireshark captures.

                      Thanks.

                      1 Reply Last reply Reply Quote 0
                      • phildunlapP Offline
                        phildunlap
                        last edited by

                        Hi cwangv,

                        BACnet 3.6.2 was just released which may have a fix for the COV points stopping collecting data after some indeterminate amount of time. It also has a fix for COV points that are set to log on an interval not always storing data.

                        I did not see what would be causing the long poll times you saw at a certain threshhold of points. Maybe a wireshark of that specific event could be illustrative in the timing of the BACnet conversation.

                        cwangvC 2 Replies Last reply Reply Quote 0
                        • cwangvC Offline
                          cwangv @phildunlap
                          last edited by

                          @phildunlap
                          Phil,
                          you guys rock. I will give it a go.
                          Thanks.

                          1 Reply Last reply Reply Quote 0
                          • cwangvC Offline
                            cwangv @phildunlap
                            last edited by

                            @phildunlap
                            I tried again and experienced the same problem with massive timeout issue with every single points.
                            I will send the links for the wireshark captures for your guys to review.
                            At the meantime, i will continue to monitor the interval logging function on points with COV enabled.

                            thanks.

                            1 Reply Last reply Reply Quote 0
                            • phildunlapP Offline
                              phildunlap
                              last edited by phildunlap

                              Thanks for the kind words and the captures!

                              What I notice is that the device at 192.168.69.105 doesn't respond to fragmented readPropertyMultiple requests. Can you try reading the device object's "Segmentation Supported" property to see what sort of segmentation is allowable?

                              I found this article relevant: https://store.chipkin.com/articles/segmentation-in-bacnet

                              We can see in the capture that the device can certainly send segmented messages, but perhaps it can not receive them and the other BACnet client is polling in small requests that do not require segmentation. I would have expected this to have been handled in the code (since it can read the segmentation support off the device object), but what I see in the capture is that it does not reply to segmented requests.

                              Edit: I believe I found in the IAm device 70105 in the captures that it professes to support segmentation in both directions. Hmm.

                              cwangvC 1 Reply Last reply Reply Quote 0
                              • cwangvC Offline
                                cwangv @phildunlap
                                last edited by

                                @phildunlap
                                Phil,
                                On the PICS of device 70105, it states below:
                                0_1570141527173_3fa29a34-0873-49d2-83f9-30015e21cc27-image.png .
                                I then mapped the segmentation property out in Mango, it shows value of '0' which does correspond to supporting both direction of segmentation of messages.
                                0_1570141820740_9639192c-2b67-49b0-8312-085b443a64fb-image.png

                                Would you be able to email me back the captures that shows Mango sending out segmented request but not receiving segmented responses?

                                Then , I will ask our technical support about device 70105 actual capability of supporting 'receive segmented messages'.

                                Thanks.

                                1 Reply Last reply Reply Quote 0
                                • phildunlapP Offline
                                  phildunlap
                                  last edited by

                                  The file called Mango Wireshark between 2019-10-03 1904PM to 2035PM I saw it in. Too large for email, as it was when you sent it!

                                  We can see frame 170715 for instance is sent but no Segment ACK is sent in response to it. I don't know if BACnet4J would wait to send the completing fragment until after receiving the ACK, but that's what appears to be happening. We can then see the six second retries. I'm not sure why the request in question is being sent out twice in rapid succession, though.

                                  1 Reply Last reply Reply Quote 0
                                  • First post
                                    Last post