DocumentCode :
1979968
Title :
Dynamic semantic textual document clustering using frequent terms and named entity
Author :
Yafooz, Wael M. S. ; Abidin, Siti Z. Z. ; Omar, Normaliza ; Halim, Rosenah A.
Author_Institution :
Fac. of Comput. & Math. Sci., Univ. Teknol. MARA, Shah Alam, Malaysia
fYear :
2013
fDate :
19-20 Aug. 2013
Firstpage :
336
Lastpage :
340
Abstract :
Data is mostly stored in digital format rather than hard copy because the former is safer, more secure, smaller in size, and faster to retrieve than the latter. With the increasing number of electronic documents to be organized for users to obtain knowledge and integrate information, document clustering has been applied to group textual documents based on their similarities. Many attempts have been made to perform textual document clustering with highly accurate results (i.e., close to natural classes) and high processing performance. However, such proposed techniques work in batch (or static) mode, in which performance tends to be sacrificed because all the terms in the document are used, at times resulting in overlapping or scalability issues. The few studies that focus on dynamic clustering also report performance issues. This paper contributes to the investigation of textual document clustering approaches and highlights the importance of using dynamic clustering when mining frequent terms together with named entities. This method is used to achieve high efficiency and high-quality data clustering. The method can also benefit textual document clustering algorithms in many text domain applications.
Keywords :
data mining; pattern clustering; text analysis; data storage; document similarities; dynamic semantic textual document clustering; electronic documents; frequent term mining; named entity; text domain applications; textual document grouping; Algorithm design and analysis; Clustering algorithms; Conferences; Data mining; Partitioning algorithms; Semantics; Systems engineering and theory; document clustering; dynamic textual clustering; frequent term; named entity;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Engineering and Technology (ICSET), 2013 IEEE 3rd International Conference on
Conference_Location :
Shah Alam
Print_ISBN :
978-1-4799-1028-1
Type :
conf
DOI :
10.1109/ICSEngT.2013.6650195
Filename :
6650195