HTTP Retriever Stops Working [update: all data sources hang, not just HTTP]

rcopeland

Hello,

I am using the HTTP Retriever data source. It works for a while (not always the same amount of time, but around 30 minutes or so) and then it stops updating. If I go to the data source and click disable, it disables. But if I then click enable, it gives me the error message "Timer already cancelled", and the Tomcat monitor shows:

> WARN 2011-02-11 09:14:13,953 (org.directwebremoting.util.CommonsLoggingOutput.warn:59) - --Erroring: batchId[5] message[java.lang.IllegalStateException: Timer already cancelled.]

Any other change I try to save to the data source gives the same "Timer already cancelled" error message.
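For context, the wording of that exception matches what the JDK's own java.util.Timer throws when a task is scheduled after the timer has been cancelled. Nothing in this thread establishes which timer class Mango uses internally, so the following is only a minimal sketch of the general behaviour using java.util.Timer as a stand-in. The names TimerCancelledDemo and trySchedule are made up for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerCancelledDemo {
    /** Tries to schedule a no-op task; returns the error message, or null on success. */
    static String trySchedule(Timer timer) {
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // Intentionally empty: only the scheduling call matters here.
                }
            }, 1000L);
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Daemon timer so the JVM can exit without waiting for the timer thread.
        Timer timer = new Timer("demo-timer", true);
        System.out.println("before cancel: " + trySchedule(timer));
        timer.cancel();
        // Once cancelled, a java.util.Timer rejects every new task.
        System.out.println("after cancel: " + trySchedule(timer));
    }
}
```

With java.util.Timer, the same state is also reached if the timer's thread dies unexpectedly, for example because a task threw an uncaught exception. In that case every later attempt to schedule fails as if cancel() had been called, which would be consistent with every save or re-enable attempt failing in the same way.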

I am thinking that Mango must get hung up on something, which prevents further updates from happening, even though I have set my retries to 0 and timeout to 10 seconds.

Extra info: I have tested this on more than one website, and both hang at approximately the same point, indicating that it is not a problem with the website.

Suggestions? Any help is appreciated!

rcopeland

More info: this could possibly be a bug with the current version of Mango. All of my data sources stop updating within a minute or so of each other. Even virtual data sources giving random numbers hang, and produce the same "timer already cancelled" error if I try to restart them. I am still not sure what causes this problem, but it seems that 45 minutes or so after Mango and Tomcat are started, the data sources stop.

I will revert to a previous version of Mango and test this if I have a chance.

Attachment: download link

fmunhoz

What database are you using, Derby or MySQL?

rcopeland

Derby. I haven't changed much of anything in Mango other than some graphical elements.

mlohbihler

Did reverting to a previous version help? The "timer already cancelled" error indicates that the Mango runtime was shut down, which is a curious thing to have happen. It would be interesting to know what the threads are doing just before and just after the points stop updating.
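One way to see what the threads are doing is a thread dump taken shortly before and shortly after the points stop updating. The JDK's jstack tool can produce one from outside when pointed at the Tomcat process ID. The sketch below shows the equivalent from inside a JVM using Thread.getAllStackTraces(). It only sees threads of the JVM it runs in, so to be useful for Mango it would have to be called from code running inside Tomcat. The class name ThreadDump and the output layout are assumptions for illustration, not part of Mango.

```java
import java.util.Map;

public class ThreadDump {
    /** Builds a text snapshot of every live thread's name, state, and stack. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState())
              .append(" daemon=").append(t.isDaemon())
              .append('\n');
            // One line per stack frame, innermost call first.
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Comparing two such snapshots shows which threads are blocked or waiting, and on what, at the moment the updates stop.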

rcopeland

Not sure why I was having these errors, but they seem to have stopped. The only thing I can think of is that I use a LOT of meta data sources, some of which have very long functions and take in the values of MANY points. Maybe the functions were timing out or crashing the program.

mlohbihler

Is there anything interesting in your logs?

rcopeland

Besides the "timer already cancelled" messages, the "High priority active count: " seems very high, going up to ~500. I have attached the log file containing the majority of these error messages.

Attachment: download link

mlohbihler

Yeah, the timer has a limit of 500 threads. It's an artificial limit, but is usually suitable. Your system is adding more than 10 new high priority tasks every 10 seconds, which is definitely bad (but only if it can't clear them out). I'll go ahead and assume that these are database write-behind tasks, which would mean that Derby is insufficient for the scale of your system.
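To make the "only if it can't clear them out" point concrete, here is a rough back-of-the-envelope sketch. It assumes a fixed number of tasks is added and a fixed number is completed in each 10-second interval. The completion figures are invented for illustration and are not taken from the attached log. BacklogEstimate and intervalsUntilCap are hypothetical names.

```java
public class BacklogEstimate {
    /**
     * Returns how many intervals pass before the active count reaches the cap,
     * or -1 if tasks are cleared at least as fast as they are added.
     */
    static long intervalsUntilCap(int cap, int addedPerInterval, int completedPerInterval) {
        int net = addedPerInterval - completedPerInterval;
        if (net <= 0) {
            return -1; // backlog never grows
        }
        return (cap + net - 1) / net; // ceiling division
    }

    public static void main(String[] args) {
        int cap = 500;            // thread limit mentioned above
        int intervalSeconds = 10; // assumed reporting interval

        // Hypothetical completion rates: none cleared, most cleared, all cleared.
        int[] completed = {0, 8, 10};
        for (int c : completed) {
            long n = intervalsUntilCap(cap, 10, c);
            if (n < 0) {
                System.out.println("completed=" + c + ": cap never reached");
            } else {
                System.out.println("completed=" + c + ": cap reached after "
                        + (n * intervalSeconds) + " seconds");
            }
        }
    }
}
```

Under these assumptions, 10 tasks added and none cleared per interval hits the cap after 500 seconds, while clearing 8 of every 10 stretches that to 2500 seconds. If tasks are cleared as fast as they arrive, the cap is never reached.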