DocumentCode :
628151
Title :
Terms extraction from unstructured data silos
Author :
Lomotey, Richard K. ; Deters, Ralph
Author_Institution :
Dept. of Comput. Sci., Univ. of Saskatchewan, Saskatoon, SK, Canada
fYear :
2013
fDate :
2-6 June 2013
Firstpage :
19
Lastpage :
24
Abstract :
The major challenge that the big data era brings to the services computing landscape is the debris of unstructured data. This high-dimensional data is in heterogeneous formats, is schemaless, and requires multiple storage APIs in some cases. This situation has made it almost impractical to apply existing data mining techniques, which are designed for schema-based data sources, in a knowledge discovery in databases (KDD) process. In this paper, a tool called TouchR is proposed which algorithmically relies on the Hidden Markov Model (HMM) to extract terms from data silos; specifically, distributed NoSQL databases, which we model as a network graph. Our use case graph consists of storage nodes such as CouchDB, Neo4J, DynamoDB, etc. The evaluation of TouchR shows high accuracy for terms extraction and organization.
Keywords :
SQL; data mining; distributed databases; document handling; graph theory; hidden Markov models; network theory (graphs); API; CouchDB; DynamoDB; HMM; KDD process; Neo4J; TouchR tool; data mining techniques; distributed NoSQL database; heterogeneous-schemaless high-dimensional data; hidden Markov model; knowledge discovery-in-database process; network graph; schema-based data sources; storage nodes; term extraction; term organization; unstructured data silos; Data mining; Dictionaries; Distributed databases; Feature extraction; Hidden Markov models; Mathematical model; Hidden Markov Model (HMM); NoSQL; Unstructured data mining; big data; terms extraction;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
System of Systems Engineering (SoSE), 2013 8th International Conference on
Conference_Location :
Maui, HI
Print_ISBN :
978-1-4673-5596-4
Type :
conf
DOI :
10.1109/SYSoSE.2013.6575236
Filename :
6575236