Please Note This forum exists for community support for the Mango product family and the Radix IoT Platform. Although Radix IoT employees participate in this forum from time to time, there is no guarantee of a response to anything posted here, nor can Radix IoT, LLC guarantee the accuracy of any information expressed or conveyed. Specific project questions from customers with active support contracts are asked to send requests to support@radixiot.com.

posted in User help

This is running core 3.3.1 ATM. I see that latest core is 3.3.3 and I will upgrade tomorrow.

posted in User help

So while increasing the size of the ec2 instance and switching to the memory-medium option has allowed us to catch up on the historical points, I am still noticing a significant memory leak. Here are graphs of our system stats over the past 5 days since I started running mango on a t2.large instance.
[Image: mango-stats-latest..png]

We are now working with 8G and the memory-medium option tells java it can use 5G. I have been watching the memory usage steadily climb with periodic jumps once a day around the time when our persistent TCP data sync is scheduled and we get a surge of points.

Why does the memory consistently grow? This made the system unresponsive again for me at a critical time when I had to demo the system for a potential client. Are we doing something wrong here?

Thank you

Adam

posted in User help

Yeah I was just coming to this myself. I wanted to let it run and see how it handled it but I can see that I just need more memory. I'm bumping it up to a t2.large with 8G of memory. It was actually not crashing even though it was at 99% memory usage, but swap usage was climbing to 50% of the 2G available. We'll see how this performs now...

Thanks for your continued help with this.

posted in User help

Good to know. Thank you again. So far java hasn't run out of memory with the memory-medium ext-enabled but I'm also hitting 98% system memory usage and starting to use swap. But the response times are still OK.

I increased my query interval from 10s to 90s. I don't think this is the cause of the issue at all but it won't hurt to hit that endpoint less frequently. I will need to reconfigure telegraf to just grab the metrics I want.

The points waiting to be written are high but are still staying a tad lower than they were before. I think they'll stay high for a while, as long as mango is catching up on historical point values. They are peaking around 10k whereas before they were hitting upwards of 15k.

I intend to disable swagger for production but I have been experimenting with it there as I was instrumenting Mango. Thanks for the reminder though.

posted in User help

Awesome! I will check that out. Is there any way to view the swagger interface for both v1 and v2 without restarting? Or can the swagger interface only be enabled for one version at a time?

posted in User help

Thanks for the advice! I will reduce how frequently I hit that endpoint and make the query more specific.

posted in User help

I just realized that the memory-small.sh extension was enabled. I'm bumping it up to medium to see if that helps.

posted in User help

The issue persists. I am really not sure what to do here. I can't move forward with any other work while our server is failing every 40 minutes.

I am willing to share some access to our instance of Mango or get on the phone to talk this through if that helps.

This is Adam from iA3 BTW.

posted in User help

The system crashes about every 40 minutes with the update, which appears to be a little longer than it was lasting before. So something was improved by moving to 3.3.1 but not everything... Also I don't start getting errors until the system hits about 69% memory usage.

I just started collecting points waiting to be written. Here is the graph for the last 30 minutes. There were around 15k points before the crash. Now it is down below 100 which is hard to see on the graph in the picture. I just restarted it though so we'll see if it fails again.

I lowered the persistent point value throttle to 5,000,000 as you suggested and I increased the small batch wait time to 20ms and decreased the batch write behind spawn threshold by an order of magnitude to 10,000. I can lower that further if you think it would help but it was previously set at 100,000 so I didn't want to drop it so drastically all at once. I lowered max batch write behind to 6.

What's strange to me is that I don't think I am seeing abnormal io wait % time.

You can see the memory usage hit a cap at 70% and then drop down to around 40% when I restarted mango. The CPU usage % spiked as well. The blue line is the user usage %. The system and io usage % are also graphed but they are all well below 10%. Weighted IO Time is maybe a little high but it's not spiking with the increased load, so that seems strange. The load average, DB write per second, point value database rate, number of open files, and MangoNoSQL open shards graphs are coming from Mango's /v2/server/system-info endpoint.

posted in User help

Well it is locking up. REST API response times have increased above 2 seconds and are now not responding. And now: Exception .... Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

How exactly should I go about tuning the MangoNoSQL database?

Should I start with the publishers? There are a number of them and that will require me to log in to multiple mango instances to adjust them as I don't have JWT tokens set up for all of them yet.

I'm still watching iotop to see what is going on with mango when it starts to error out.
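Since the failure is a GC overhead error, heap occupancy is probably worth watching alongside iotop. Below is a minimal sketch, assuming the JDK's `jstat` tool is available and that its `-gcutil` output has a column headed `O` (old generation, in percent), as it does on Java 8. The helper name `old_gen_over` and the variable `MANGO_PID` are made up for illustration.

```shell
#!/usr/bin/env bash
# Read `jstat -gcutil` output on stdin and report every sample whose
# old-generation occupancy (column "O", percent) exceeds a threshold.
# The column is located by its header name, so the layout is not hard-coded.
# old_gen_over is a hypothetical helper name, not part of Mango or the JDK.
old_gen_over() {
  awk -v limit="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "O") col = i; next }
    col && $col + 0 > limit { print "old gen at " $col "%" }
  '
}

# Possible use (MANGO_PID is an assumed variable holding the JVM's PID;
# 5000 asks jstat for one sample every 5 seconds):
#   jstat -gcutil "$MANGO_PID" 5000 | old_gen_over 90
```

If the old generation stays near 100% even right after full collections, that would point at objects being retained rather than at a short-lived surge of incoming points.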

posted in User help

Wow. I spoke too soon. I just started getting BWB task queue full errors again and memory usage is back up to 70%.

As long as this doesn't crash Mango it can be tolerated but I want to get to the bottom of this and figure out how to avoid it. More to come...

posted in User help

Phil, Terry

Thank you again for your help. Upgrading to 3.3.1 appears to be resolving the issue. Memory usage is not jumping straight back up to around 70% but is a bit below 60% ATM. The errors I was seeing on startup before have stopped so I think this amount of memory usage is probably normal right now while the persistent tcp connections catch up. I'm keeping a close eye on it though.

Terry,
Thanks for taking a closer look. We do indeed have a number of instances of Mango publishing data to this instance, as well as an HTTP publisher running to publish data into InfluxDB. I am going to take a closer look at the IO stats to see if there is a bottleneck there because this isn't the first time we've had BWB task queue errors.

I'm going to get more familiar with the tuning parameters available for the TCP publishers and MangoNoSQL.

posted in User help

Phil

Thanks for your reply. This is currently crashing a production server for one of our customers so your help is greatly appreciated. The problem has escalated to locking mango up within 20-30 minutes after restart. I have also tried rebooting the entire system. I'm going to upgrade mango if it isn't running latest already but interacting with mango through the web interface requires regularly restarting mango.

Here is the debug information you asked for. I hope it helps.

response from /rest/v1/threads?asFile=true&stackDepth=40
https://gist.github.com/8c041b87440fa9b1d39482fdb62b8028

jmap output
https://gist.github.com/095849b00dcf1093098335bbb01226e8

BTW I'm using InfluxData's open source time series database stack to monitor systems and mango. Their web front end tool created the graphs I posted above. Telegraf is a lightweight data collection agent that has numerous plugins. I wrote a Mango HTTP Listener plugin for telegraf to receive live data and parse out the proper timestamp. I will likely publish that plugin soon but it still needs some polish. The data above, though, I was able to collect with the existing plugins for querying HTTP endpoints and monitoring system stats.

posted in User help

Hi

I have been monitoring Mango's memory consumption over the past week and we recently had it stop responding on us. This is the second time this has occurred in the past two weeks. It appears that it generally takes about a week for it to get to the point of not responding.

Here are some graphs of the memory consumption and REST and login page response times.

The load average is coming from Mango's internal monitoring and you can see spikes every night at 1am when we have the backup scheduled.

The % memory used is from the system but according to top, java is the biggest memory consumer. As of writing this it is responsible for 48% of memory usage and 195% CPU load. The % memory used started around 45% about two weeks ago and plateaued a bit above 70%. When it became unresponsive to web requests today, I restarted the server, but memory consumption nearly immediately returned to near 70%.

The Local REST API response times and HTTPS ui/login response times are from localhost and go through a locally running Apache proxy which serves up the SSL certs.

We are running this on an Amazon EC2 t2.medium with 4G of memory and 2 CPU cores.

There are a ton of errors in our logs which can be viewed in their entirety here: https://gist.githubusercontent.com/anonymous/bef071adf061ec4e63f0fe2d3b8fa854/raw/540bcaa9aac3abdfd33ab7ff003b24165d622862/mango-logs

I admit I am a tad overwhelmed by the amount of stuff in the logs and I find most of it difficult to understand not knowing the internals of mango. Some errors of note though:

ERROR 2018-02-26T21:15:07,109 (com.infiniteautomation.nosql.MangoNoSqlBatchWriteBehindManager$StatusProvider.scheduleTimeout:731) - 3 BWB Task Failures, first is: Task Queue Full

and

Feb 27 16:20:58 Nusak.NUS.iA3.io ma.sh[358]: Java HotSpot(TM) 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGTERM to handler- the VM may need to be forcibly terminated

BTW we are using oracle java (not openjdk) on Arch Linux.

posted in User help

@Jared-Wiltshire @phildunlap

Thanks again for your help!

posted in User help

Philip

Thanks for your prompt reply. I'm a little confused still though.

"I suspect you are using an RQL endpoint."

I'm querying Mango's REST API V2. The same one that the Swagger interface shows you. Is there some other option here?

"You can put limit(-1) in your query string to have no limit."

Could you be more specific about how to specify the limit in a REST query?

To clarify, with mangocli I am using curl to send a GET request to this endpoint: /rest/v2/data-points

I tried adding it as a URL param `https://nus.ia3.io/rest/v2/data-points?limit=-1` but that fails with 500.

Attempting to put it in the body of a GET request also fails.

Edit: I just looked back at the swagger interface and saw the note "Use RQL formatted query"
I'm looking into RQL. I'm not yet familiar with it.

Edit 2: After reading up on RQL, I played around with my query and got the following URI to work: /rest/v2/data-points?limit(400). Unfortunately limit(-1) causes a SQL error with MariaDB. I get back the following

{
  "message": "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '-1' at line 1",
  "stackTrace": "sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\nsun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
...

Thank you for pointing me in the right direction!
Where can I read up on what RQL mango accepts other than reading the source code directly?
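
Until limit(-1) works, one workaround might be to page through the results. Here is a minimal sketch, assuming the endpoint also accepts the two-argument RQL form limit(count,offset). Only limit(count) is confirmed above, so the offset argument is an assumption, and `rql_pages` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Print one RQL query path per page for a collection of TOTAL items.
# Assumption: the server honors limit(count,offset); only limit(count)
# was confirmed to work in the post above.
# rql_pages is a hypothetical helper, not part of mangocli.
rql_pages() {
  local total=$1 size=$2 offset=0
  while (( offset < total )); do
    printf 'data-points?limit(%d,%d)\n' "$size" "$offset"
    (( offset += size ))
  done
}

# 334 is the "total" reported by the endpoint, 100 the page size.
rql_pages 334 100
```

For a total of 334 and a page size of 100 this prints four paths, with offsets 0, 100, 200 and 300. Each path needs to be quoted when passed on the command line, because parentheses are special to the shell.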

posted in User help

I'm running the latest version of Mango with all modules up to date.

When I perform a GET request to the data-points endpoint I get back a JSON object with two top level fields: "total" and "items". The "total" value is 334; however, the number of items in the "items" array is always only 100.

For example using my curl wrapper called mangocli:

$ mangocli data-points | head -n5
{
  "total": 334,
  "items": [
    {
      "id": 1,
$ mangocli data-points | jq '.items | length'
100

This is true for both the v1 and v2 APIs and when I use the swagger interface and curl to make the requests.

The user account that the JWT token is associated with has full admin privileges in Mango.

It seems like some sort of pagination but I can't find any mention of this behavior in the forum or in swagger. I also tried adding a query parameter for page but that resulted in a 500 error.

adamlevy

Hi all,

Since Mango 3.3 introduces the use of JWT tokens for authentication, I decided that using the REST API is a bit more appealing. So I decided to write a little utility to make it easier to query.

mangocli is written in Bash and uses curl (https://curl.haxx.se/) and jq.

You can review the code and download it here: https://github.com/iA3io/mangocli

This is an open source tool and you are free to use and modify it however you see fit. I hope you find it helpful.

More information is in the README in the Git repo but here is a snippet:

Example Usage

Query all data points

$ mangocli -a ./jwt-token data-points

The default HTTP action is GET. The default host is http://localhost:8080.

Create a new data point

$ mangocli -a ./jwt-token -d @./new-point.json data-points

The -d option is just the --data option for curl. The @ prefix causes
the argument to be parsed as a file. Using the -d option changes the default
HTTP action to POST.

Enable and restart a data point

$ mangocli -a ./jwt-token -u enabled=true -u restart=true PUT data-points/enable-disable/DP_XID

Using the -u option defines a URL query parameter.

Delete a data point.

$ mangocli -a ./jwt-token DELETE data-points/DP_XID
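
To make the mapping onto curl concrete, here is a rough sketch of the curl arguments that a call with `-u` parameters corresponds to. This is not mangocli's actual code: `mango_curl_args` is a hypothetical helper, and it assumes the JWT token is sent as an `Authorization: Bearer` header and that the v2 API lives under /rest/v2. It only prints the arguments, one per line, and sends nothing.

```shell
#!/usr/bin/env bash
# Print, one per line, the curl arguments for a Mango REST call that
# passes its data as URL query parameters.
# Hypothetical helper for illustration, not taken from mangocli.
# Assumptions: JWT sent as "Authorization: Bearer <token>", API root /rest/v2.
mango_curl_args() {
  local host=$1 token=$2 action=$3 path=$4
  shift 4
  # --get makes curl append --data-urlencode values to the URL;
  # --request overrides the HTTP method (GET, PUT, DELETE, ...).
  local args=(--request "$action" --header "Authorization: Bearer $token" --get)
  local param
  for param in "$@"; do
    args+=(--data-urlencode "$param")
  done
  args+=("$host/rest/v2/$path")
  printf '%s\n' "${args[@]}"
}

# Mirrors the enable-and-restart example; TOKEN is a placeholder.
mango_curl_args http://localhost:8080 TOKEN PUT \
  data-points/enable-disable/DP_XID enabled=true restart=true
```

Printing the arguments instead of running curl keeps the sketch safe to try against nothing at all; the real tool handles more cases, such as request bodies via `-d`.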

Manual

Usage: mangocli [OPTION...] [ACTION] PATH

Options:
  -h HOST      HOST is the domain name or IP address of the Mango server.
               Can be set using the environment variable MANGOCLI_HOST.
               Defaults to 'http://localhost:8080' if not specified.
               A HOST without a specified protocol will default to HTTPS.

  -a AUTH      AUTH is the JWT token or the path to a file containing it.
               Can be set using the environment variable MANGOCLI_AUTH

  -d DATA      DATA to include in the body of a request. Can be specified multiple times.
               Use `-d @-' to read from stdin or `-d @filename' to read from a file.
               Equivalent to `curl --data="DATA"'. Changes default ACTION to POST.

  -u PARAM     PARAM is URL encoded data. Can be specified multiple times.
               Using this option will pass all data as URL parameters.
               Use `-u @-' to read from stdin or `-u @file' to read from a file.
               Equivalent to `curl --get --data-urlencode="PARAM"'.

  -C CURL_OPTS CURL_OPTS is a space separated list of additional options to pass to curl. 
               Can be specified multiple times.

  -V VERSION   VERSION is the REST API version number: 1 or 2.
               Can be set using the environment variable MANGOCLI_VERSION.
               Defaults to v2.

  -c           Compact JSON output. Removes all whitespace.

  -v           Increase verbosity. Once for info, twice for debug output.

  -q           Quiet mode. Suppresses all error, info and debug messages. 
               Overrides any `-v' options. 

Arguments:
  ACTION       The HTTP action: GET, POST, PUT, DELETE
               Defaults to GET.

  PATH         The path or route for the request

adamlevy

This worked for me. Thank you!

BTW we had rest.customDateOutputFormat already specified in our env.properties file; it was just the wrong format. So the hard coded defaults are probably fine.

adamlevy

Running the latest version of Mango with all modules up to date. When I go to generate a JWT token for any user I get the error message "Failed to create authentication token: Bad Request"

Here is the corresponding error output from the logs: https://pastebin.com/XF3CbaLM

The first line is:

ERROR 2018-01-29T13:26:50,292 (com.serotonin.m2m2.web.mvc.rest.v1.exception.RestExceptionHandler.handleExceptionInternal:74) - Cannot deserialize value of type `java.util.Date` from String "2018-02-05T22:26:48.933Z": not a valid representation (error: Failed to parse Date value '2018-02-05T22:26:48.933Z': Unparseable date: "2018-02-05T22:26:48.933Z")

We are running this version of java:

$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Please let me know what additional information I can provide.