

    Alarms due to high priority task already running

      Pedro

      Terry,

      Like you, I was also surprised that a virtual data source would abort polls considering it does not have much work to do. I dumped the threads to a 1MB JSON file. This was easy as this FATAL error occurs frequently. I will email it to support@...., along with the JSON for the points of the Settings data source and a snippet of ma.log. Unfortunately the high number of alarms this error generated has triggered a spam threshold, which caused my email provider to disable outgoing emails from my domain. So I must wait for the spam limiter to reset before I can send the email.

      The Settings datasource is used only to hold constants. Is it best practice to hold constants in a NO_CHANGE virtual datasource, or in a metadata point as: return my.value; set to update once per year?

        terrypacker

        Pedro,

        It all depends on what you are using the constants for. If they are only used in a Meta or Scripting data source, then a Global Script constant would be best. If you need them inside a data point or the database, then a low period polling Virtual data source would be best. I would think running a script to store them in a data point would be the least ideal option, as the JavaScript engine needs to be spun up to execute the script, whereas a No Change virtual point literally does nothing when it polls besides spawning the poll thread.

          Pedro

          Thanks. The points are used by meta datasource functions, and displayed in a graphical view where certain users can easily change their values by directly entering a new point value. Perhaps it would be more accurate to call them "user-edited threshold values," i.e. "variables that seldom change," rather than calling them constants. Some are changed by a metadata function, but that is the exception.

          I probably have well over one hundred JavaScript metadata functions. I don't want the settings in a global script because they would not be user-editable, nor would their history be logged. From what you state, it sounds like I should move those seldom-changing variables from meta to a NO_CHANGE virtual source. However, if I do that, I will lose my point history unless I roll up my sleeves and issue some commands from Mango's MySQL console, which I'd rather avoid unless I can expect a noticeable impact on performance.

            terrypacker

            Pedro,

            A few questions about performance.

            1. From what you said about the SQL console, it appears you are storing all your point value data in MySQL. Is that correct?

            2. Are you running on Java 7 or 8? I benchmarked the latest Java 8's Nashorn JavaScript engine against the old Mozilla engine and found it to be almost 2x as fast when running Mango's data source scripts.

              Pedro

              I'm storing pointValues in TSDB. However, if I want to retain the point history I presume I would have to reassign the point using SQL, since I cannot move points from one datasource to another. It would be very nice if I could do that, as there are a number of metadata points that I would like to move from one metadata source to another without having to get dirty with SQL, which I find risky.

              I'm running Java 8 under Ubuntu:

              $ java -version
              java version "1.8.0_45"
              Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
              Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
              
              
                Pedro @terrypacker

                @terrypacker My email hosting provider finally unblocked my outgoing emails, which had been blocked because it was also hosting Mango's alarm emails. So I just emailed the requested data to the support email address.

                Since Mango was even slower today, I finally restarted it, taking the opportunity to upgrade the API, dashboard, and report modules.

                The % idle went up significantly today:
                [Image: CPU-idle-1week.png]
                And the CPU usage (100 - idle %) went down accordingly:
                [Image: CPU-utilization-1day.png]

                My working theories are that either:

                • There is a memory leak that led to excessive disk swapping
                • Some error messages that used to only show up in ma.log also show up in /events.shtm. The combination of writing errors to SQL and text to ma.log may have caused Mango to fall behind on its tasks, to the point where it could not keep up unless it cancelled high priority tasks
                • An intermittent task continually raised the CPU utilization and was killed during the restart.

                Of the graphs below, the most interesting ones to me are the block writes, interrupts per second, swap memory, and heap memory. Labels are above the graphs. I think the block writes growing may be the ma.log writing rate increasing due to a positive feedback loop. What explains JVM-max-available-memory?

                CPU-IOwait
                [Image: CPU-IOwait.png]
                free-memory
                [Image: free-memory.png]
                interrupts-per-second
                [Image: interrupts-per-second.png]
                jvm-heap-memory
                [Image: jvm-heap-memory.png]
                JVM-max-available-memory
                [Image: JVM-max-available-memory.png]
                swap-memory
                [Image: swap-memory.png]
                scheduled-work-items
                [Image: scheduled-work-items.png]

                Thanks.

                  Pedro @terrypacker

                  @terrypacker There may be a way to reproduce this issue, if it is caused by a high CPU load: since recent Mango versions take more CPU to render graphical views, if enough users or web browsers display a complex graphical view, the CPU idle time will fall too low, task timeouts will occur, and errors will start filling the event list and ma.log. Additional CPU will be utilized to send event emails.

                  The additional load will likely put Mango into a sustained mode where the tasks cannot keep up with real time, and the task manager will start shedding tasks and generating more error messages. At that point, reducing the page views will not lower the CPU load enough for Mango to dig itself out of this vicious cycle, because servicing those very errors will sustain the high CPU load.
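
                  To make the vicious cycle concrete, here is a toy model of it. This is only a sketch under made-up assumptions, not a description of how Mango schedules work: `simulate`, `capacity`, `base_load`, and `error_cost` are hypothetical names, the units are arbitrary, and the key assumption is that every unit of work that misses its deadline creates more than one unit of error-servicing work (log write, event record, email) on the next tick.

```python
def simulate(view_load, capacity=100.0, base_load=70.0, error_cost=2.0):
    """Return the total load per tick for a sequence of page-view loads.

    All numbers are arbitrary units, not measurements from a Mango system.
    """
    error_work = 0.0
    history = []
    for views in view_load:
        load = base_load + views + error_work
        history.append(load)
        # Work that could not run on time during this tick.
        overflow = max(0.0, load - capacity)
        # Assumption: each unit of missed work costs error_cost units of
        # logging, event storage, and email on the next tick.
        error_work = error_cost * overflow
    return history


# One tick of heavy page views, then none.
short_spike = simulate([50] + [0] * 5)
# Three ticks of heavy page views, then none.
long_spike = simulate([50] * 3 + [0] * 3)

print(short_spike)
print(long_spike)
```

                  With these made-up numbers, the short spike settles back to the base load of 70, while the longer spike keeps growing even after the page views stop. That is the behaviour I suspect, where only reducing the error-servicing cost breaks the loop.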

                  Obviously I do not want to cause this error on my live system, but I offer the following methods of preventing, assessing, and mitigating these issues:

                  Assessment:

                  • Add a profiler to display to the admins how much time each module is using. A core module time per page rendered would also be useful.
                  • Add a "Runtime status" section to report poll times of metadata sources, or of metadata points. If it can be resource intensive, pass it as a command switch for troubleshooting.
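
                  As a rough illustration of the kind of per-task timing I have in mind, here is a minimal sketch. It is not based on Mango's code: `TaskProfiler`, `measure`, `report`, and the task names are all hypothetical, and the fake clock in the usage lines exists only to make the example deterministic.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TaskProfiler:
    """Accumulates wall-clock time and call counts per task name."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._total = defaultdict(float)
        self._calls = defaultdict(int)

    @contextmanager
    def measure(self, name):
        """Time the enclosed block and add it to the totals for `name`."""
        start = self._clock()
        try:
            yield
        finally:
            self._total[name] += self._clock() - start
            self._calls[name] += 1

    def report(self):
        """Return (name, total_seconds, calls) rows, most expensive first."""
        rows = [(n, self._total[n], self._calls[n]) for n in self._total]
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Usage with a fake clock: the first task "takes" 0.5 s, the second about 0.1 s.
ticks = iter([0.0, 0.5, 1.0, 1.1])
profiler = TaskProfiler(clock=lambda: next(ticks))
with profiler.measure("meta-source-poll"):
    pass
with profiler.measure("virtual-source-poll"):
    pass
rows = profiler.report()
print(rows[0][0])
```

                  Sorting by total time puts the most expensive task first, which is what an admin would want to see at the top of a "Runtime status" section.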

                  Prevention (or reduction):

                  • Use the above time profiling data to determine what is causing most of the CPU load, so either developers or users can determine what part of their configuration consumes the most CPU, so they know where to make optimizations.
                  • Optimize the graphical view renderer for better performance
                  • Does Mango log in/out of the POP server after each email it sends? If so, increase email notification performance by not logging out.

                  Mitigation: once Mango is in this vicious cycle of reporting errors, the CPU load must be reduced to break the cycle, mainly by cutting the load spent servicing errors:

                  • Allow the administrator to temporarily raise the bar on servicing events, such as:
                    • Suspend reporting timeout errors to the event database
                    • Suspend reporting timeout errors to ma.log
                    • Raise ma.log reporting level to not report WARN
                    • Raise the bar on outgoing event emails (Information, Urgent, Critical)
                    • Raise the bar on logging to the event database (Information, Urgent, Critical)
                  • Temporarily flush either the low, medium, or high priority job queue
                  • Once the system is caught up, these measures should be put back to normal.

                  Perhaps this could be thought of as a soft reset rather than a hard restart.
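
                  A sketch of what "raising the bar" could look like, purely as an illustration: `AlarmLevel` and `EventGate` are hypothetical names, not Mango classes, and the enum lists only the three levels mentioned above (a real system may define more).

```python
from enum import IntEnum


class AlarmLevel(IntEnum):
    # Only the three levels named in the list above.
    INFORMATION = 1
    URGENT = 2
    CRITICAL = 3


class EventGate:
    """Drops events below a minimum level while a temporary override is active."""

    def __init__(self, normal_minimum=AlarmLevel.INFORMATION):
        self._normal = normal_minimum
        self._minimum = normal_minimum
        self.suppressed = 0  # count of dropped events, for later review

    def raise_bar(self, minimum):
        """Temporarily accept only events at or above `minimum`."""
        self._minimum = minimum

    def restore(self):
        """Put the threshold back to normal once the system has caught up."""
        self._minimum = self._normal

    def accept(self, level):
        if level >= self._minimum:
            return True
        self.suppressed += 1
        return False


gate = EventGate()
gate.raise_bar(AlarmLevel.CRITICAL)
during = [gate.accept(level) for level in AlarmLevel]
gate.restore()
after = gate.accept(AlarmLevel.INFORMATION)
print(during, after, gate.suppressed)
```

                  Counting the suppressed events rather than silently discarding them would let the administrator see afterwards how much was skipped during the overload.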

                    Pedro @terrypacker

                    @terrypacker Since the restart the CPU idle time has been around 70% and graphical view simplepoints have been updating much faster. However, high priority tasks are still being "rejected because Task Currently Running:"

                    2016-05-08 23:28:36,733 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@39de09ff rejected because Task Currently Running 
                    2016-05-08 23:47:59,172 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@775e67cb rejected because Task Currently Running 
                    2016-05-08 23:47:59,327 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@39de09ff rejected because Task Currently Running 
                    2016-05-09 00:07:11,462 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@1c7b07ad rejected because Task Currently Running 
                    2016-05-09 00:26:35,707 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@30133195 rejected because Task Currently Running 
                    2016-05-09 00:26:35,719 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@588b748a rejected because Task Currently Running 
                    2016-05-09 00:45:48,775 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@1260f118 rejected because Task Currently Running 
                    2016-05-09 01:05:18,388 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@2795a94 rejected because Task Currently Running 
                    2016-05-09 01:24:34,660 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@3dc97162 rejected because Task Currently Running 
                    2016-05-09 01:43:50,236 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@4b00d126 rejected because Task Currently Running 
                    2016-05-09 02:03:05,363 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@fec7ec6 rejected because Task Currently Running 
                    2016-05-09 02:22:22,817 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@39de09ff rejected because Task Currently Running 
                    2016-05-09 02:41:36,815 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@3dc97162 rejected because Task Currently Running 
                    2016-05-09 03:00:53,652 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@39de09ff rejected because Task Currently Running 
                    2016-05-09 03:20:15,385 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@1260f118 rejected because Task Currently Running 
                    2016-05-09 03:39:32,514 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@588b748a rejected because Task Currently Running 
                    2016-05-09 03:58:53,301 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@4b00d126 rejected because Task Currently Running 
                    2016-05-09 04:18:13,426 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@3dc97162 rejected because Task Currently Running 
                    2016-05-09 04:37:13,392 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@30133195 rejected because Task Currently Running 
                    2016-05-09 04:56:29,434 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@2795a94 rejected because Task Currently Running 
                    2016-05-09 05:16:03,176 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@2795a94 rejected because Task Currently Running 
                    2016-05-09 05:56:16,256 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@4b00d126 rejected because Task Currently Running 
                    2016-05-09 06:16:13,354 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@b082053 rejected because Task Currently Running 
                    2016-05-09 06:35:48,535 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@2795a94 rejected because Task Currently Running 
                    2016-05-09 06:55:42,575 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@4b00d126 rejected because Task Currently Running 
                    2016-05-09 07:15:19,662 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@fec7ec6 rejected because Task Currently Running 
                    2016-05-09 07:15:19,947 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@79cb4799 rejected because Task Currently Running 
                    2016-05-09 07:35:53,464 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@30133195 rejected because Task Currently Running 
                    2016-05-09 07:55:40,366 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@2795a94 rejected because Task Currently Running 
                    2016-05-09 08:16:35,096 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@fec7ec6 rejected because Task Currently Running 
                    2016-05-09 08:37:16,287 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@30133195 rejected because Task Currently Running 
                    2016-05-09 09:17:54,381 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@33bf3c4c rejected because Task Currently Running 
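
                    To look for a pattern in these rejections, the timestamps and task identifiers can be pulled out of ma.log with a short script. This is a minimal sketch, not a Mango tool: the regular expression is fitted to the lines above and assumes that layout, and `parse`, `summarize`, and `LINES` are hypothetical names used only for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern fitted to the FATAL lines shown above (an assumption about the log layout).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) FATAL .*"
    r"TimeoutTask@(?P<task>[0-9a-f]+) rejected because Task Currently Running"
)


def parse(lines):
    """Yield (timestamp, task_id) for every rejected-task line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            ts = ts.replace(microsecond=int(match.group("ms")) * 1000)
            yield ts, match.group("task")


def summarize(lines):
    """Return (rejections per task id, gaps in seconds between consecutive rejections)."""
    events = sorted(parse(lines))
    counts = Counter(task for _, task in events)
    gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]
    return counts, gaps


PREFIX = (" FATAL (com.serotonin.m2m2.util.timeout."
          "RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: "
          "com.serotonin.m2m2.util.timeout.TimeoutTask@")
SUFFIX = " rejected because Task Currently Running "

# The first three lines of the excerpt above, rebuilt from their parts.
LINES = [
    "2016-05-08 23:28:36,733" + PREFIX + "39de09ff" + SUFFIX,
    "2016-05-08 23:47:59,172" + PREFIX + "775e67cb" + SUFFIX,
    "2016-05-08 23:47:59,327" + PREFIX + "39de09ff" + SUFFIX,
]

counts, gaps = summarize(LINES)
print(counts.most_common())
print(gaps)
```

                    Reading the timestamps in the excerpt by eye, most rejections are roughly 19 to 21 minutes apart, sometimes arriving in pairs a fraction of a second apart, which makes me suspect a periodic job rather than random overload.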
                    
                    
                      terrypacker

                      @Pedro thanks for the information,

                      I was unable to come up with a solution for your system with the time I had allotted. In the next few weeks (for the 2.8.0 release) we will be making changes to the code in these areas. Once we re-open development I will be able to take a closer look and set up some simulations to see where we can make improvements.

                      I will be in touch with any questions or requests for more information as soon as we begin 2.8.0 development.

                      Thanks,
                      Terry

                        Pedro @terrypacker

                        @terrypacker Thanks for the update. Please keep in mind that this problem is not unique to my installation.
