DocumentCode
547309
Title
An improved sIB algorithm for document clustering using combination weighting measures
Author
Ji, Bo ; Ye, Yangdong
Author_Institution
Sch. of Inf. Eng., Zhengzhou Univ., Zhengzhou, China
Volume
3
fYear
2011
fDate
10-12 June 2011
Firstpage
110
Lastpage
114
Abstract
This paper presents an improved sIB algorithm (CW-sIB) for high-dimensional document clustering using combination weighting. Traditionally, research on feature weighting for clustering has focused on finding a single effective weighting scheme. However, choosing a proper weighting scheme is widely acknowledged to be a difficult problem. To address this issue, we propose a linear combination weighting method derived from the idea of combination evaluation for multiple attribute decision making problems. Combination weighting can overcome the limitations of using a single weighting scheme and helps to better reflect the essential characteristics of the document data. Experiments on real document data show that the proposed CW-sIB algorithm is superior to the sIB algorithm. We also report which combinations of weighting scheme elements show merit in the decomposition of datasets.
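The abstract does not give the exact formulation of the linear combination weighting method, so the following is only a minimal sketch of the general idea: several per-term weighting schemes are normalized and then mixed with coefficients that sum to 1. The function names (`normalize`, `combine_weights`), the sum-to-one normalization, and the example scheme values are assumptions for illustration, not the authors' definitions.

```python
def normalize(weights):
    """Scale a term -> weight mapping so that its values sum to 1 (assumed normalization)."""
    total = sum(weights.values())
    if total == 0:
        return dict(weights)
    return {term: w / total for term, w in weights.items()}


def combine_weights(schemes, coeffs):
    """Linearly combine several term-weighting schemes.

    schemes: list of dicts mapping term -> weight (e.g. a TF-like and an IDF-like scheme).
    coeffs:  list of non-negative mixing coefficients, assumed here to sum to 1.
    """
    if len(schemes) != len(coeffs):
        raise ValueError("need exactly one coefficient per weighting scheme")
    if abs(sum(coeffs) - 1.0) > 1e-9:
        raise ValueError("coefficients are assumed to sum to 1")

    combined = {}
    for scheme, alpha in zip(schemes, coeffs):
        # Normalizing first keeps schemes on different scales comparable.
        for term, w in normalize(scheme).items():
            combined[term] = combined.get(term, 0.0) + alpha * w
    return combined


# Hypothetical weights for two terms under two different schemes.
scheme_a = {"cluster": 2.0, "document": 2.0}
scheme_b = {"cluster": 1.0, "document": 3.0}
mixed = combine_weights([scheme_a, scheme_b], [0.5, 0.5])
```

In a clustering pipeline, weights combined this way would be applied to the document-term representation before the clustering step; how CW-sIB chooses the coefficients is not stated in the abstract.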
Keywords
decision making; pattern clustering; text analysis; CW-sIB algorithm; combination weighting measures; document clustering; feature weighting; improved sIB algorithm; information retrieval; linear combination weighting method; machine learning; multiple attribute decision making problem; text categorization; text mining methods; Accuracy; Clustering algorithms; Complexity theory; Indexes; Partitioning algorithms; Text categorization; Weight measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-8727-1
Type
conf
DOI
10.1109/CSAE.2011.5952644
Filename
5952644