Data Crawler for CDO (Climate Data Online) web services
NCDC's CDO web services offer current and historical climate data from data sets covering
the US and the rest of the world. Using a basic REST client to query these web services is fine as long as you only
need some data, but when you need to query a lot of data or many locations, it quickly becomes a time-consuming and
challenging task. This is where Climate Data Crawler helps: it allows you to easily set up queries and to run them for
multiple locations.
Each query specifies a dataset (for instance GHCND or GHCNDMS) and a datatype (for instance MMNT, monthly mean minimum temperature).
Climate Data Crawler also tags all data records with the corresponding locationId to make it easier to create aggregated results
based on locationIds and not only stationIds.
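To illustrate why the locationId tag is useful, here is a minimal sketch that averages values per location across several stations. The record shape is an assumption for illustration: only `locationId` is described above, and the station ids, the `CITY:XX000001` id and the values are made up.

```javascript
// Hypothetical sample records: every field except locationId is an assumption.
const records = [
  { locationId: 'CITY:AS000002', station: 'GHCND:STATION_A', datatype: 'MNTM', value: 250 },
  { locationId: 'CITY:AS000002', station: 'GHCND:STATION_B', datatype: 'MNTM', value: 270 },
  { locationId: 'CITY:XX000001', station: 'GHCND:STATION_C', datatype: 'MNTM', value: 100 }
];

// Compute the mean value per locationId, regardless of which station reported it.
function meanByLocation(items) {
  const totals = {};
  items.forEach(function (item) {
    if (!totals[item.locationId]) {
      totals[item.locationId] = { sum: 0, count: 0 };
    }
    totals[item.locationId].sum += item.value;
    totals[item.locationId].count += 1;
  });

  const means = {};
  Object.keys(totals).forEach(function (id) {
    means[id] = totals[id].sum / totals[id].count;
  });
  return means;
}

console.log(meanByLocation(records));
```

Without the tag, the two stations for the first location would have to be mapped back to their city by a separate lookup.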
You also need node.js and its package manager (npm) installed. You can download node.js (including npm) from: http://nodejs.org/.
Clone the Climate Data Crawler repository using git:
```
git clone https://github.com/jonbern/climate-data-crawler.git
cd climate-data-crawler
```
Install npm dependencies by running the command below:
```
npm install
```
Then retrieve the locations to query:
```
node getLocations.js
```
This will retrieve all locations classified as cities and store the result in a file called CITIES.json.
You can edit the query used in getLocations.js to query a different set of locations if desired.
The example below will get the most recent data for the first 100 locations in CITIES.json, using 2010 as the
data probing stop year.
```
node app.js --dataset GHCNDMS --datatype MNTM --locations 'CITIES.json' --probingStopYear 2010 --offset 0 --count 100
```
The example above assumes you have a 'CITIES.json' file in your climate-data-crawler directory. Results will be stored in ./data.
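When CITIES.json holds more locations than you want to crawl in one run, the `--offset` and `--count` options can be used to work through the file in batches. The sketch below plans such batches. It assumes that the two options select a contiguous slice of the locations array, which the example above suggests but does not state; `selectLocations` and `planBatches` are hypothetical helpers, not part of the package.

```javascript
// Assumption: --offset and --count select a contiguous slice of the locations array.
function selectLocations(locations, offset, count) {
  return locations.slice(offset, offset + count);
}

// Split a list of `total` locations into { offset, count } batches, one per crawler run.
function planBatches(total, batchSize) {
  const batches = [];
  for (let offset = 0; offset < total; offset += batchSize) {
    batches.push({ offset: offset, count: Math.min(batchSize, total - offset) });
  }
  return batches;
}

// 250 locations in batches of 100 gives two full batches and one batch of 50.
planBatches(250, 100).forEach(function (batch) {
  console.log('--offset ' + batch.offset + ' --count ' + batch.count);
});
```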
npm package
You can also install Climate Data Crawler as an npm package:
```
npm install climate-data-crawler --save
```
This is particularly useful if you need to incorporate Climate Data Crawler into your own project, for instance if you want to build
custom crawling strategies on top of CdoDataProbingQuery or use CdoApiClient to create custom queries.
Usage
CLI:
```
node app.js --dataset GHCNDMS --datatype MNTM --locations 'CITIES.json' --probingStopYear 2010 --offset 0 --count 100
```
This will get the most recent MNTM data for the first 100 locations in CITIES.json, using 2010 as the data probing stop year.
Using the CLI, results will automatically be stored to disk (./data folder).
JS:
```
var fs = require('fs');
var CdoDataCrawler = require('climate-data-crawler/cdoDataCrawler');
var DataProbingBounds = require('climate-data-crawler/dataProbingBounds');

var dataset = 'GHCNDMS'; // Global Historical Climatology Network-Monthly
var datatype = 'MNTM'; // monthly mean temperature
var locations = JSON.parse(fs.readFileSync('CITIES.json', 'utf8')); // locations to query
var dataProbingStopYear = 2010; // data probing stop year
var dataProbingBounds = new DataProbingBounds(dataProbingStopYear); // data probing bounds algorithm

var crawler = CdoDataCrawler.createInstance(dataset, datatype, locations, dataProbingBounds, 0, 100);
crawler.run(function(results, locationsNoData){
  // do something with the results and log which locations returned no data
});
```
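As a starting point for the callback body, the sketch below writes the results to disk and records which locations returned no data. It assumes that `results` is an array of records and `locationsNoData` an array of location ids; the shapes are not documented here, so adjust to what you actually receive. `saveResults` and the file names are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write one JSON record per line and list the locations that returned no data.
// Assumption: results is an array of records, locationsNoData an array of ids.
function saveResults(results, locationsNoData, outputDir) {
  fs.mkdirSync(outputDir, { recursive: true });
  const resultsFile = path.join(outputDir, 'results.json');
  const lines = results.map(function (record) {
    return JSON.stringify(record);
  });
  fs.writeFileSync(resultsFile, lines.join('\n') + '\n');
  fs.writeFileSync(path.join(outputDir, 'no-data.json'), JSON.stringify(locationsNoData));
  return resultsFile;
}

// Usage with made-up data; in practice this would be called from the crawler.run callback.
const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'crawler-'));
const savedFile = saveResults(
  [{ locationId: 'CITY:AS000002', value: 250 }],
  ['CITY:XX000001'],
  outputDir
);
console.log('Saved results to ' + savedFile);
```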
This example will query the Brisbane location for the most recent monthly mean temperatures between 2014 and 2010:
```
var CdoDataProbingQuery = require('climate-data-crawler/cdoDataProbingQuery');

var startYear = 2014;
var stopYear = 2010;
var dataQuery = CdoDataProbingQuery.createInstance('CITY:AS000002', 'GHCNDMS', 'MNTM', startYear, stopYear);
dataQuery.run(function(queryResult){
  console.log(queryResult);
});
```
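One way to read "data probing" is: start at the most recent year and step backwards until a year with data is found or the stop year has been passed. The sketch below shows that idea in isolation. It is an interpretation, not the package's actual implementation, and `probeMostRecent` and `fakeFetchYear` are hypothetical names with a fake data source standing in for the web service.

```javascript
// Walk backwards from startYear to stopYear and report the first year that has data.
// fetchYear(year, callback) is a hypothetical stand-in for a real API call.
function probeMostRecent(startYear, stopYear, fetchYear, done) {
  function tryYear(year) {
    if (year < stopYear) {
      return done(null); // no data anywhere in the probing range
    }
    fetchYear(year, function (records) {
      if (records && records.length > 0) {
        return done({ year: year, records: records });
      }
      tryYear(year - 1);
    });
  }
  tryYear(startYear);
}

// Fake data source for the example: only 2012 has data.
function fakeFetchYear(year, callback) {
  callback(year === 2012 ? [{ date: '2012-01-01', value: 250 }] : []);
}

probeMostRecent(2014, 2010, fakeFetchYear, function (result) {
  console.log(result ? 'Most recent data found in ' + result.year : 'No data in range');
});
```

The stop year bounds the number of requests made for locations that have no recent data at all.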
Get Brisbane's monthly mean temperatures between 01 January 2014 and 31 December 2014:
```
var CdoApiClient = require('climate-data-crawler/cdoApiClient');

var locationId = 'CITY:AS000002';
var dataset = 'GHCNDMS';
var startDate = '2014-01-01';
var endDate = '2014-12-31';
var datatypeid = 'MNTM';

var queryPath = '/cdo-web/api/v2/data?datasetid=' + dataset
  + '&locationid=' + locationId
  + '&startdate=' + startDate
  + '&enddate=' + endDate
  + '&datatypeid=' + datatypeid
  + '&limit=1000';

var client = CdoApiClient.createInstance(queryPath);
client.query(function(result){
  console.log(result);
});
```
Retrieve details of all registered stations:
```
var fs = require('fs');
var CdoApiClient = require('climate-data-crawler/cdoApiClient');

var queryPath = '/cdo-web/api/v2/stations?limit=1000';
var client = CdoApiClient.createInstance(queryPath);
client.query(function(result){
  console.log(result);
  fs.appendFileSync('./stations.json', JSON.stringify(result) + '\r\n');
});
```
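Both examples above build the query path by hand. If you create many custom queries, a small helper can assemble the path from a plain object instead. The sketch below uses Node's built-in `URLSearchParams`; `buildQueryPath` is a hypothetical helper, not part of the package, and it percent-encodes values (for instance the colon in a location id), which a standards-compliant server should decode transparently.

```javascript
// Build a CDO query path from an endpoint name and a plain object of parameters.
// URLSearchParams percent-encodes the values, so ids such as 'CITY:AS000002' are escaped.
function buildQueryPath(endpoint, params) {
  const query = new URLSearchParams(params).toString();
  return '/cdo-web/api/v2/' + endpoint + (query ? '?' + query : '');
}

const queryPath = buildQueryPath('data', {
  datasetid: 'GHCNDMS',
  locationid: 'CITY:AS000002',
  startdate: '2014-01-01',
  enddate: '2014-12-31',
  datatypeid: 'MNTM',
  limit: 1000
});

console.log(queryPath);
```

The resulting string can be passed to `CdoApiClient.createInstance` in the same way as the hand-built paths.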
Error handling
All three components have support for defining an error callback to handle errors.
```
...
var errorCallback = function(error){
  // your error handling here
};
crawler.run(successCallback, errorCallback);
```
GHCND - Global Historical Climatology Network-Daily data set:
* PRCP - Precipitation (tenths of mm)
GHCNDMS - Global Historical Climatology Network-Monthly Summaries data set:
* MNTM - Monthly mean temperature
* MMNT - Monthly mean minimum temperature
* MMXT - Monthly mean maximum temperature
* TPCP - Total precipitation
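When post-processing results it can help to keep these datatype ids in a small lookup table. The sketch below copies the descriptions from the list above and adds a conversion for PRCP, the only datatype whose unit is stated there. `DATATYPES`, `describeDatatype` and `prcpToMillimetres` are hypothetical names, not part of the package.

```javascript
// Descriptions copied from the list above; only PRCP has its unit stated there.
const DATATYPES = {
  PRCP: 'Precipitation (tenths of mm)',
  MNTM: 'Monthly mean temperature',
  MMNT: 'Monthly mean minimum temperature',
  MMXT: 'Monthly mean maximum temperature',
  TPCP: 'Total precipitation'
};

// Return a readable description for a datatype id.
function describeDatatype(id) {
  return Object.prototype.hasOwnProperty.call(DATATYPES, id)
    ? DATATYPES[id]
    : 'Unknown datatype: ' + id;
}

// PRCP values are reported in tenths of a millimetre, so divide by 10 to get millimetres.
function prcpToMillimetres(value) {
  return value / 10;
}

console.log(describeDatatype('MNTM'));
console.log(prcpToMillimetres(254) + ' mm');
```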
NCDC's (National Climatic Data Center) CDO (Climate Data Online) web services v2
Wikipedia: Global Historical Climatology Network
GHCND (Global Historical Climatology Network)-Monthly Summaries documentation