Author_Institution :
Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
Abstract :
Unstructured data, e.g., texts, images, and videos, is growing at an explosive rate with the development of the Internet and social networks. Because unstructured data is so varied, it is highly desirable to design a generalized model that represents all kinds of unstructured data and to build a system that organizes it effectively. In this paper, we first define a generalized data model to represent unstructured data. On top of this data model, we further propose RAISE, a whole process modeling method comprising Repository, Analysis, Index, Search, and Environment. Furthermore, we design a SQL-like unstructured query language (UQL) for flexible access to the RAISE model. We implement the proposed method in a distributed unstructured data management system named D-Ocean, which is scalable, reliable, and highly available.
Keywords :
SQL; data models; database indexing; distributed databases; D-Ocean; RAISE model; SQL-like unstructured query language; UQL; distributed unstructured data management system; generalized data model; process modeling method; repository-analysis-index-search-and-environment; unstructured data representation; Analytical models; Database languages; Feature extraction; Indexes; Videos; unstructured data management; whole process modeling; unstructured query language