DocumentCode :
3511194
Title :
An Efficient Pretopological Approach for Document Clustering
Author :
Thanh Van Le ; Trong Nghia Truong ; Hong Nam Nguyen ; Tran Vu Pham
Author_Institution :
HCMC Univ. of Technol., Ho Chi Minh City, Vietnam
fYear :
2013
fDate :
9-11 Sept. 2013
Firstpage :
114
Lastpage :
120
Abstract :
In this paper we propose a new document clustering approach that does not require a distance metric to aggregate multiple variables when measuring the similarity between documents. Based on the calculated coherence (or pseudoclosure) function and the closure sets of pretopology, we suggest a new proposition for exploring clusters when the connections between data are represented by one or many equivalent relations. Our approach also reveals the data structure aggregated over multiple criteria by examining successive levels of the pseudoclosure function, which can represent data expansions. Furthermore, noisy data can be detected automatically with our method, since elementary closures of very small cardinality signify limited connections to the other elements. We also compare our approach with the popular K-Means and Fast Pair Nearest Neighbor methods on a given document collection to evaluate its efficiency and performance.
Keywords :
document handling; pattern clustering; unsupervised learning; calculated coherence function; cardinality size; cluster exploration; data connection; data expansions; data structure; document clustering approach; document similarity; fast pair nearest neighbor approach; k-means approach; pretopological approach; pseudoclosure function; Abstracts; Coherence; Data mining; Error analysis; Measurement; Topology; Vectors; closure; document clustering; multi-criteria; pretopology; pseudoclosure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Networking and Collaborative Systems (INCoS), 2013 5th International Conference on
Conference_Location :
Xi'an
Type :
conf
DOI :
10.1109/INCoS.2013.25
Filename :
6630395