Title :
PAIRS: A scalable geo-spatial data analytics platform
Author :
Levente J Klein;Fernando J Marianno;Conrad M Albrecht;Marcus Freitag;Siyuan Lu;Nigel Hinds;Xiaoyan Shao;Sergio Bermudez Rodriguez;Hendrik F Hamann
Author_Institution :
IBM TJ Watson Research Center, Yorktown Heights, NY 10598
Abstract :
Geospatial data volume exceeds hundreds of petabytes and is increasing exponentially, driven mainly by images, videos, and data generated by mobile devices and high-resolution imaging systems. Fast data discovery on historical archives and/or real-time datasets is currently limited by the variety of data formats, which differ in projection and spatial resolution and require extensive processing before analytics can be carried out. A new platform called Physical Analytics Integrated Repository and Services (PAIRS) is presented that enables rapid data discovery by automatically updating, joining, and homogenizing data layers in space and time. Built on top of open-source big data software, PAIRS manages automatic data download, data curation, and scalable storage while simultaneously serving as a computational platform for running physical and statistical models on the curated datasets. Because data curation is addressed before the data are uploaded to the platform, multi-layer queries and filtering can be performed in real time. In addition, PAIRS offers a foundation for developing custom analytics. To that end, we present two examples of models that are running operationally: (1) high-resolution evapotranspiration and vegetation monitoring for agriculture and (2) hyperlocal weather forecasting driven by machine learning for renewable energy forecasting.
Keywords :
"Geospatial analysis","Satellites","Spatial resolution","Real-time systems","Data models","Weather forecasting"
Conference_Titel :
Big Data (Big Data), 2015 IEEE International Conference on
DOI :
10.1109/BigData.2015.7363884