• DocumentCode
    1707780
  • Title

    Dart: A Geographic Information System on Hadoop

  • Author

    Hong Zhang ; Zhibo Sun ; Zixia Liu ; Chen Xu ; Liqiang Wang

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Wyoming, Laramie, WY, USA
  • fYear
    2015
  • Firstpage
    90
  • Lastpage
    97
  • Abstract
    In big data research, analytics on spatio-temporal data from social media is one of the fastest-growing areas and poses a major challenge for both research and applications. Users need an efficient and flexible computing and storage platform to analyze spatio-temporal patterns in huge amounts of social media data. This paper introduces Dart, a scalable, distributed geographic information system built on Hadoop and HBase. Dart provides a hybrid table schema for storing spatial data in HBase so that the Reduce phase can be omitted for operations such as calculating the mean center and the median center. It employs appropriate pre-splitting and hashing techniques to avoid data imbalance and hot-region problems. It also supports large-scale spatial data analyses such as K-Nearest Neighbors (KNN) and Geometric Median Distribution. In our experiments, we evaluate Dart's performance by processing 160 GB of Twitter data on an Amazon EC2 cluster. The experimental results show that Dart is very scalable and efficient.
  • Keywords
    Big Data; data analysis; geographic information systems; parallel programming; Amazon EC2 cluster; Big Data research; Dart; HBase; Hadoop; Twitter data; geometric median distribution; hash techniques; hybrid table schema; k-nearest neighbors; massive spatial data analysis; pre-splitting techniques; scalable distributed geographic information system; social media data; spatio-temporal data; Algorithm design and analysis; Computational modeling; Data analysis; Geographic information systems; Media; Spatial databases; Twitter; GIS; Hadoop; HBase; KNN; Mean Center; Median Center; Social Network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on
  • Conference_Location
    New York City, NY
  • Print_ISBN
    978-1-4673-7286-2
  • Type

    conf

  • DOI
    10.1109/CLOUD.2015.22
  • Filename
    7214032
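
The mean center and median center named in the abstract can be illustrated with a small single-machine sketch. This is not Dart's implementation: the abstract does not say how Dart computes either statistic, so the function names `mean_center` and `median_center`, the use of Weiszfeld's iteration for the geometric median, and the treatment of coordinates as planar (x, y) pairs are all assumptions made here for illustration.

```python
import math


def mean_center(points):
    """Return the mean center: the average x and the average y of the points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def median_center(points, tol=1e-9, max_iter=1000):
    """Approximate the geometric median (the point minimizing the sum of
    Euclidean distances to all points) with Weiszfeld's iteration.

    Assumption: planar coordinates; real longitude/latitude data would
    normally be projected first.
    """
    cx, cy = mean_center(points)  # start from the mean center
    for _ in range(max_iter):
        wsum = xsum = ysum = 0.0
        for x, y in points:
            d = math.hypot(x - cx, y - cy)
            if d < tol:
                # Simplification: skip a point that coincides with the
                # current estimate to avoid dividing by zero.
                continue
            w = 1.0 / d
            wsum += w
            xsum += w * x
            ysum += w * y
        if wsum == 0.0:
            break  # every point coincides with the estimate
        nx, ny = xsum / wsum, ysum / wsum
        if math.hypot(nx - cx, ny - cy) < tol:
            return (nx, ny)
        cx, cy = nx, ny
    return (cx, cy)


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    print("mean center:", mean_center(sample))
    print("median center:", median_center(sample))
```

The mean center needs only a running sum and a count per coordinate, whereas the median center is iterative and re-reads every point on each pass, so the two statistics differ considerably in cost on large data sets.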