I have set up some update events on some meta points. I have set up the Update Event for teh beginning of the day and it is grabbing the AVg value. Problem is... sometimes it works.. Sometimes it doesn't. Has anybody else had an issue where the update event is seeming to not work?
Update Event not Working
What version of Mango are you using?
catkins, what do you mean by "not Working?" Does the time stamp update but not the value, or or do both the time stamp and the value stop updating, or is it another problem?
I have been having problems with Mango metadata points not updating at least since I first upgraded to version 2.x.x. Usually the symptom was that both the value and the timestamp would simply no longer update. This occurs with points set to update on context update, and with points set to update at "start of minute." It may occur with other update settings too.
Often this would occur when "Medium priority work items" ballooned to 10's of thousands. After recently upgrading to core 2.7.4 build 468 and removing all patches, I increased my "Medium priority core pool size" from 3 to 300. So far my medium priority work items have not ballooned, although in the past few months, weeks may go by without that occurring anyways. Unfortunately, when medium priority work items are high or metadata stop updating, we can't use the event detectors to send an email because their priority is the same or lower than the metadata point.
This past weekend I saw that some metadata points had out-of-date values. Further aggravating the problem was the fact that their time stamps were up to date. This occurred with at least three different metadata points. Today I noticed the same symptom with another metadata point that was set to update at "start of minute."
My usual work around has been to disable then immediately enable the metadata point, after which the value updates on its normal event schedule. Saving the metadata point with no changes has the same effect. The problem is that I have hundreds of metadata points: too many to detect manually, and in any case, the defect could occur in the middle of the night when no one is watching.
In earlier versions of Mango, at least we could use the time stamp as a good indicator of out-of-date calculations, but it seems that in the current release we cannot. This means that even if the medium priority work items are very low or zero, we cannot use the "No update" detector to alarm the frozen metadata. Even if we could, it would be quite tedious to add that alarm to all my points.
Were any changes made to metadata or event handling between core 2.7.4 and 2.7.6?
Until these issues are fixed, Metadata cannot be relied upon. Consequently, fixing this should be given a high priority.
Can you check your logs around the time you observed the value stopped behaving as you expected? What is the point's logging setting? As I understand the code's activity, if your timestamp was updating and you are not set to interval logging, the Meta point's script was run and either returned that value or failed to run due to a point in its context missing or some such. That would generate events. The timestamp would not update, though.
To answer your question, the significant changes were not to the event handling that would produce what you describe. The ballooning medium work items was probably caused by a highly contested lock, and all we did was improve the blocking logic. Event notifications have always and still do have their own threadpool which would be responsible for executing a meta point for a context update.
Phil, I see that the metadata point value stopped changing around 9am, even though its time stamp continued updating. I did not notice at the time because that value changes very slowly under normal circumstances. That metadata point function uses only one context point, from a Modbus datasource. It is set to log on change with a tolerance of "1," but its actual value should have been well in excess of that.
Upon reviewing the logs, it seems that in this case the metadata point value stopped updating when the modbus datasource in the context was shut down (understandably). However, when the data source enabled, apparently the metadata point did not resume calculation. The real-time value was frozen, but its time stamp continued to update.
Other than aborted polls and denied access security warnings, the logs around that time are:
2016-03-06 08:50:52,257 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@3ac07e17 rejected because Task Currently Running
That must be new; I have not seen it in the logs before. Also:
Unfortunately that log entry cites neither the metadata function that failed nor its "point that is disabled or does not exist." I have too many metadata scripts to check individually. I look forward to ma-core github issue #632 addressing this. It may be the same point, but could also be any of a number of other metadata points.
Nonetheless, the error above should not update the metadata point timestamp, and the issue here is that the timestamp updates but the point value does not. Additionally, the point that was not updating resumed working days later when I saved it with no changes, indicating that saving the point probably triggered an internal "point unreliable" flag to change state. There have been issues with the "point unreliable" tag before. Perhaps you should consider rendering that value to the user, or at least logging when point reliability states change.
2016-03-06 09:09:48,232 INFO (com.serotonin.m2m2.rt.RuntimeManager.stopDataSource:375) - Data source 'AcuDC' stopped
The above datasource is in fact used by the metadata point that stopped calculating.
I was logged in at the time, I don't remember stopping that Modbus Data source, but do not rule it out.
The Modbus source was later restarted:
2016-03-06 09:09:49,588 INFO (com.serotonin.m2m2.rt.RuntimeManager.initializeDataSource:302) - Data source 'AcuDC' initialized 2016-03-06 09:09:49,588 INFO (com.serotonin.m2m2.rt.RuntimeManager.initializeDataSource:307) - Data source 'AcuDC' took 1329.476193ms to start
However, after the Modbus source restarted the metadata point value no longer changed, but the metadata time stamp did change.
I don't know whether the log below is related:
2016-03-06 09:32:58,628 FATAL (com.serotonin.m2m2.util.timeout.RejectedHighPriorityTaskEventGenerator.rejected:27) - High priority task: com.serotonin.m2m2.util.timeout.TimeoutTask@24fd95ca rejected because Task Currently Running
In any case, the following day, the point values were still unchanged even though their time stamps were current and the context point had only been disabled momentarily. Do you want me to email you the unfiltered log?
Note that the real-time value was displaying the old number. The point value displayed in a graphical view Simple Point or Watch List should display the latest value regardless of how the value logging is configured.
Thank you for making such a thoughtful investigation! I am not sure if it is the cause of your specific issue or not, but in testing this I did discover a repeatable condition where 1) the meta point was updating, 2) its value was wrong, 3) it was not marked unreliable. I am sure we will address what I found promptly.
Can you tell me a little more about the meta point? i wonder, 1) How many points are in the context, 2) How many update the context, 3) What is the script?
If you do not wish to answer here, emailing support with the JSON for the point could be useful.
This applies to you too, catkins, if you would be willing to share. Edit: Re-reading, I see you mentioned your meta point was running on a cron, so nevermind, that is probably enough information. I am curious how many points are in the context, though.
This is resolved in the newest Meta module, and we implemented your feature request for providing more information concerning which context points are disabled, which would require getting the latest Core.
Thanks. I am looking forward to the update.