Mango does not start

gbodacs

I set the "fix db corruption problems" flag in the property file and, after 40 minutes of loading, Mango finally started!
I mean 40 minutes from the startup sequence until I could log in.
The db file size is ~2 GB.

I did not wait 40 minutes before setting the flag, so I'm not sure the db was in a corrupted state...

I still think the root cause of the problem is the Windows update.

phildunlap

Hi gbodacs,

Have you been watching the log when these issues occur? A corrupted H2 database would very likely produce log messages. If your H2 database is corrupted, you can move your Mango/databases directory aside (to Mango/moved-databases, say) and try a database backup restore (from the system settings page). If everything goes well, you would need to move your Mango/moved-databases/mangoTSDB into the new Mango/databases directory to keep your old point data.

Have you used any ext-enabled scripts to allocate memory to Mango? It sounds like it may be a memory issue. If you haven't, you may wish to move a memory.bat script from Mango/bin/ext-available to Mango/bin/ext-enabled and set it to 2 GB or more. You could also try purging your events and userEvents tables in the Purge Settings section of the system settings.

The 40 minute startup may have been Mango fixing corruption in the NoSQL database (perhaps you once closed it by simply hitting the X in the console, which is a hard kill in Windows). That should only have affected your system if it were running low on memory, since NoSQL corruption can be fixed while Mango is running if the resources are available. I would guess this is incidental, and that you had corruption from closing the window at some point.

Ctrl+C is the correct way to shut down a Mango running in a command prompt. You should wait for it to prompt whether you really wish to exit the program.

gbodacs

Hi Phil,

Thank you for the fast response!
After running the "db corruption check", Mango works like a charm! I'm pretty sure I did not close the console window with the X, but I found a timeout in the configuration file:
runtime.shutdown.medLowTimeout=60
runtime.shutdown.highTimeout=60

What happens if I start the shutdown process and one minute is not enough? Could this corrupt the database?

In the past I had lots of "high priority task timeout exception, thread pool is full (100)" errors, so I changed the 100 to 200. Could the small thread pool cause database corruption?

The log files in the log folder are always 0 bytes long. Where should I look for the notification about the corrupted database?

Thanks in advance,
gbodacs

gbodacs

Well, it happened again. I was watching the data points in Mango, and the database halted. The Mango web server is working correctly, but there is no data on the web pages (e.g. settings) or the page does not load correctly (Graphic views); only the browser's loading animation shows.

The problem is not connected to Mango startup, because the Mango instance had been working correctly for more than a day. I just realized that CPU usage is very high in this state (30-60%) and the hard disk is working hard on the database file Mango\databases\mah2.h2.db (it's not on the screenshot)!

The console log is "empty": no exceptions, no error messages. The log files in the log folder are 0 bytes long.

Last time it took 40 minutes to recover from this half-dead state.

This Mango instance controls 13 government heating centers for schools, kindergartens, hospitals, etc! Any thoughts on how to fix this?

Thanks!

gbodacs

I have found the thread that causes the problem:

Please help to fix this issue!

terrypacker

@gbodacs

I would like to know the full stack trace from that thread. Since the name starts with qtp it is a thread from the Jetty pool which means it is triggered from UI interaction. From the screenshot I can see that it is something accessing the database but the lower part of the trace is missing.

gbodacs

Hi TerryPacker!

It has been happening for more than eight hours now. CPU and hard disk usage are still very high.
The situation has changed a bit: now I cannot log in to the web page. The login page loads, but after I type the credentials and press the Login button, the page loads endlessly.

Is there a way to get this information outside of the Mango webpage?

Thank you,
Gbodacs

gbodacs

I had to restart the Mango instance because the login was not working. After I pressed Ctrl+C, Mango logged the line (Ctrl+C pressed, shutdown procedure started), but nothing happened after 5 minutes of waiting, so I killed the process.
I started it again, and now I can log in and use the web pages. CPU and HDD usage are high again and the features (Graphics View, SQL, Settings, etc.) are not working, so it is still in that state.
Luckily, the threads list in the log view works, so I took a screenshot of the suspicious thread:

Of course, this was not triggered from the UI in this case.

phildunlap

Hi gbodacs,

You can download thread dumps by navigating to http://[ip:port]/rest/v1/threads?stackDepth=50&asFile=true

This will contain the information more completely.
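
If the browser itself is hanging, the same URL can be fetched from a script with a timeout. Below is a minimal Python sketch, not an official tool. It assumes the endpoint answers without a login session (in practice authentication is probably required and is not shown here). The host, port, output file name, and the function names are placeholders made up for illustration.

```python
import urllib.parse
import urllib.request


def thread_dump_url(host, port, stack_depth=50):
    """Build the thread-dump URL quoted above for a given host and port."""
    query = urllib.parse.urlencode({"stackDepth": stack_depth, "asFile": "true"})
    return f"http://{host}:{port}/rest/v1/threads?{query}"


def save_thread_dump(host, port, out_path, timeout=30):
    """Fetch the dump and write it to out_path.

    Raises an exception on timeout or HTTP error, which is itself a useful
    signal that the REST API is not responding.
    """
    with urllib.request.urlopen(thread_dump_url(host, port), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


if __name__ == "__main__":
    # Placeholder host and port (assumption); replace with your Mango instance.
    size = save_thread_dump("localhost", 8080, "mango-threads.dump")
    print(f"wrote {size} bytes")
```

The explicit timeout means a non-responsive Mango fails fast instead of leaving the request hanging.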

gbodacs

Hi Phil!

I tried to get the thread list, but Mango did not respond. In that state (H2 database file recovery?) the REST API was not responding. I tried restarting Mango multiple times, but that did not solve the issue; the database file just got bigger and bigger.

I also tried moving the whole Mango installation, with the problematic database file, to a RAM drive to speed up the recovery. It finished in 10 minutes, but after 2-3 Mango restarts the recovery started again.

Finally, I decided to restore the database from the latest automatic backup file, and this appears to have solved the issue for the last 11 hours. I will monitor the server closely over the next 2-3 days to make sure this is the solution to the problem.

I'm worried that I don't know the root cause of the H2 database problem. Do you have a list of things I must not do with Mango to avoid this behavior in the future?

Thank you,
gbodacs

phildunlap

Glad you got it running again!

No, there is no specific list of things that will send your H2 database toward trouble. Usually someone's events table is very large, but you didn't respond to the notes about purging your events, userEvents and audit tables. Similarly, on rereading: the "db corruption check" that you said helped at one point doesn't do anything to the H2 database; it's a NoSQL setting. You also did not respond to my line of questioning about memory in an ext-enabled script. It could be that you're running out of memory.

You reported your log file was empty, but this is almost certainly not so. Can you check the ma.log file (some will have dates prepended) in either Mango/logs or Mango/bin/logs if the issue returns? You may also want to search the existing log files for any memory errors.
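
To make the log search concrete, here is a minimal Python sketch that scans both candidate log directories for memory errors. It assumes that an out-of-memory condition shows up in the log with the standard Java class name `OutOfMemoryError`, and that the log files match the pattern `*ma.log*`. Both are assumptions rather than confirmed Mango behavior, and the function name is made up for illustration.

```python
from pathlib import Path


def find_memory_errors(log_dirs, needle="OutOfMemoryError"):
    """Return (file, line number, text) for every log line containing needle."""
    hits = []
    for log_dir in log_dirs:
        directory = Path(log_dir)
        if not directory.is_dir():
            continue  # skip a log directory that does not exist
        # Assumed pattern: matches ma.log and names with a date prepended.
        for log_file in sorted(directory.glob("*ma.log*")):
            if not log_file.is_file():
                continue
            with open(log_file, errors="replace") as fh:
                for number, line in enumerate(fh, start=1):
                    if needle in line:
                        hits.append((str(log_file), number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Paths are relative to wherever Mango is installed.
    for path, number, text in find_memory_errors(["Mango/logs", "Mango/bin/logs"]):
        print(f"{path}:{number}: {text}")
```

An empty result only means the search string was not found; it does not rule out a memory problem.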