Scraping an entire data set
-
I was wondering if someone wouldn't mind pointing me in the right direction, the first step of which would be to confirm this is even possible.
I would like to scrape an entire set of data points via XML and then plot them in an Excel report. The data I will be pulling in is the hourly predicted consumption rates that an energy entity posts. They post an entire week of predictions at a time, and I'd like to pull those into Mango, analyze each reading to see if it crosses some set value, and then plot the entire data set in a report.
I'm not asking for a solution to the problem, but rather for some ideas on how I could store that data set. Alternatively, instead of plotting it, I could just read in the points one by one, compare each to my threshold, and then report the time and day at which the threshold was exceeded.
Thank you forum!
-
Hi psysak,
It sounds like a task for the Data File data source. That's the primary tool for an arbitrary XML task. You can generate the parsing class after generating an .xsd for this XML. Then you have to write a class that implements the function to get data out of the parsed XML. To fetch the file, you could either run wget on a cron or write a poll class to load into the Data File data source.
Alternatively, you could use the HttpBuilder utility in a script. On a cron, your script does something like...
HttpBuilder.get("sptth://api.extension/demand-predictions", { /*object of headers*/ }, { /*object of parameters*/ })
    .err(function(status, headers, content) {
        throw "Got non-200 response: " + status;
    })
    .resp(function(status, headers, content) {
        print(content);
        // Okay, now we have to parse the XML. You can do regex /(pattern)/.exec(content)[1]
        // Or you can do it to any degree of parsing (i.e. indexOf --> substr, new java.io.StringReader(content), etc)
    })
    .execute();
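To make the regex idea a bit more concrete, here is a minimal sketch of pulling readings out of the response text and checking them against a threshold. The XML shape, the `reading` element, its `time` and `value` attributes, the helper names, and the threshold of 500 are all assumptions for illustration; the real feed will look different, so the pattern would need adjusting.

```javascript
// Hypothetical XML shape; the real feed's element and attribute names will differ.
var content =
    '<predictions>' +
    '<reading time="2024-01-01T00:00:00Z" value="410.5"/>' +
    '<reading time="2024-01-01T01:00:00Z" value="523.0"/>' +
    '<reading time="2024-01-01T02:00:00Z" value="498.2"/>' +
    '</predictions>';

// Pull every (time, value) pair out of the XML text with a global regex.
// Each call to exec() resumes after the previous match until it returns null.
function parseReadings(xml) {
    var pattern = /<reading time="([^"]+)" value="([^"]+)"\s*\/>/g;
    var readings = [];
    var match;
    while ((match = pattern.exec(xml)) !== null) {
        readings.push({ time: match[1], value: parseFloat(match[2]) });
    }
    return readings;
}

// Keep only the readings strictly above the threshold.
function findExceedances(readings, threshold) {
    return readings.filter(function (r) {
        return r.value > threshold;
    });
}

// Hypothetical threshold of 500, in whatever unit the feed uses.
var exceeded = findExceedances(parseReadings(content), 500);
```

Inside the resp callback, the same two functions could be called on the real content, and each entry in exceeded carries the time at which the threshold was crossed.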
-
Thanks Phil! Super helpful as always.
That HttpBuilder function, what language is that part of? I ask because I'd like to see the documentation for it. Learning on the fly here :)
-
The HttpBuilder is a utility specific to Mango. You can find some notes about it in the "Mango JavaScript" help dialogue within Mango. It's linked from the 'related items' section of the contextual help for everything that uses scripts. GitHub is probably the only source that is always up to date for this file: https://github.com/infiniteautomation/ma-core-public/blob/main/Core/web/WEB-INF/dox/mangoJavaScript.htm
-
Just FYI, I can't find any mention of HttpBuilder in that Mango Scripting help.
Last question... for today :) While I'm on the topic, how about JSON.parse documentation?
-
I think I may have got it, I found some stuff in that Rhino syntax help :)
-
Nice! Yeah, JSON is a built-in object in JavaScript generally. Just an FYI, but the references to Rhino are dated. Mango 3 only runs on Java 8, which uses Nashorn as the engine. Most things should be very similar, but it may matter at some point.
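For anyone following along, a minimal sketch of JSON.parse in ES5.1-style code. The payload and its field names (unit, predictions, time, value) are made up for illustration, and parsePredictions is a hypothetical helper, not something Mango provides.

```javascript
// Hypothetical JSON payload; the field names are assumptions for illustration.
var content = '{"unit":"MW","predictions":[' +
    '{"time":"2024-01-01T00:00:00Z","value":410.5},' +
    '{"time":"2024-01-01T01:00:00Z","value":523.0}]}';

// Parse the text and return the predictions array, or an empty array on failure.
function parsePredictions(text) {
    try {
        return JSON.parse(text).predictions || [];
    } catch (e) {
        // JSON.parse throws a SyntaxError on malformed input;
        // any failure here is treated as "no data".
        return [];
    }
}

var predictions = parsePredictions(content);

// Largest predicted value in the set (-Infinity if the set is empty).
var peak = predictions.reduce(function (max, p) {
    return p.value > max ? p.value : max;
}, -Infinity);
```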
-
OK thanks
-
Interesting, the context help in Mango jumps from RQL Examples to Logging and totally misses the HttpBuilder and Context sections. Thanks for the GitHub link
-
MDN is a very good reference for all things JavaScript
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse
-
Thanks @Jared-Wiltshire. For anyone else possibly reading this, one of the issues I ran into there was the versions of the libraries, for example RegExp. I don't think Mango is very happy with the way it's documented there.
-
@psysak MDN will definitely contain references to a lot of new JavaScript features that aren't available in the Java 8 Nashorn engine we use. Nashorn should be compliant with ECMAScript 5.1. The MDN pages tell you which version of ECMAScript (JavaScript) a feature is supported in. For example, under Specifications on the JSON.parse() page, it says:
"ECMAScript 5.1 (ECMA-262)
The definition of 'JSON.parse' in that specification. Standard. Initial definition. Implemented in JavaScript 1.7."
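As a rough illustration of what staying within ECMAScript 5.1 looks like for RegExp, here is a sketch that sticks to var, function expressions, string concatenation and numbered capture groups. Named capture groups such as (?<hour>...), String.prototype.matchAll and template literals were all added in later ECMAScript editions, so they are the kind of thing to avoid here. The parseTimestamp helper and the timestamp format are assumptions for illustration only.

```javascript
// ES5.1-style code: var, function expressions, numbered capture groups.
// Returns the parsed date parts, or null if the text does not match.
function parseTimestamp(text) {
    var match = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})/.exec(text);
    if (match === null) {
        return null;
    }
    return {
        year: parseInt(match[1], 10),
        month: parseInt(match[2], 10),
        day: parseInt(match[3], 10),
        hour: parseInt(match[4], 10),
        minute: parseInt(match[5], 10)
    };
}

// Hypothetical timestamp in the format assumed above.
var stamp = parseTimestamp("2024-01-01T13:00:00Z");

// Plain concatenation instead of a template literal.
var label = "Day " + stamp.day + ", hour " + stamp.hour;
```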