Title :
An open schema for XML data in Hive
Author :
Wuheng Luo; Bo Liu; Allie K. Watfa
Author_Institution :
Enterprise Data Architecture, Sears Holdings, Hoffman Estates, IL, USA
Abstract :
Big data in XML format poses a challenge to distributed data systems. This paper proposes an open Hive schema approach to XML data placement in Hadoop. Placing XML data in Hive under this generic schema, to build a column-oriented and OLAP-focused XML data warehouse of heterogeneous content, has benefits for data access, maintenance, and scalability.
Keywords :
Big Data; XML; data mining; data warehouses; parallel processing; Hadoop; OLAP-focused XML data warehouse; XML data placement; XML format; column-oriented XML data warehouse; data access; data maintenance; data scalability; distributed data systems; heterogeneous content; open Hive schema approach; Availability; Books; Companies; Data models; Distributed databases; Hive; column-oriented; open schema; schema-less
Conference_Title :
2014 IEEE International Conference on Big Data (Big Data)
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004409