DocumentCode
3453854
Title
Research on Text Clustering Algorithms
Author
Li Qun ; Huang Xinyuan
Author_Institution
Sch. of Inf. Sci. & Technol., Beijing Forestry Univ., Beijing, China
fYear
2010
fDate
27-28 Nov. 2010
Firstpage
1
Lastpage
3
Abstract
The volume of Web documents is enormous. Text clustering places the documents that share the most words into the same cluster, so that a Web search engine can structure the large result set returned for a given query. In this article, we study three kinds of clustering algorithms: prototype-based, density-based, and hierarchical. We compare two typical algorithms, K-medoids and DBSCAN. The results show that K-medoids is sensitive to the choice of initial center points and that DBSCAN performs better.
Keywords
pattern clustering; query processing; search engines; text analysis; DBSCAN; K-medoids; Web document; Web search engine; density based clustering; hierarchical clustering algorithms; prototype based clustering; text clustering algorithm; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Films; Forestry; Noise; Partitioning algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
Database Technology and Applications (DBTA), 2010 2nd International Workshop on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-6975-8
Electronic_ISBN
978-1-4244-6977-2
Type
conf
DOI
10.1109/DBTA.2010.5659055
Filename
5659055