DocumentCode :
3012401
Title :
An effective term weighting method using random walk model for text classification
Author :
Islam, Md Rafiqul ; Islam, Md Rafiqul
Author_Institution :
Dept. of Comput. Sci. & Eng. Discipline, Khulna Univ., Khulna
fYear :
2008
fDate :
24-27 Dec. 2008
Firstpage :
411
Lastpage :
414
Abstract :
Text classification can be viewed as assigning texts to a predefined set of categories. However, many digital documents are not organized according to their content, which makes it difficult for a user to find relevant documents. Automatic text classification can address this problem. In this paper we introduce a new random walk term weighting method for improved text classification. To weight a term, we exploit the relationship between local (term position, term frequency) and global (inverse document frequency, information gain) information of terms (vertices). Moreover, we weight terms by considering the co-occurrence and semantic relations of terms as a measure of dependency. To evaluate our term weighting approach, we integrate it into the Rocchio text classification algorithm; experimental results show that our method performs better than other random walk models.
Keywords :
classification; graph theory; text analysis; Rocchio text classification algorithm; automatic text classification; digital document; random walk model; term weighting method; Algorithm design and analysis; Casting; Citation analysis; Classification algorithms; Computer science; Frequency; Information technology; Performance evaluation; Text categorization; Voting; Text classification; information gain; random walk model; semantic relation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
Conference_Location :
Khulna
Print_ISBN :
978-1-4244-2135-0
Electronic_ISBN :
978-1-4244-2136-7
Type :
conf
DOI :
10.1109/ICCITECHN.2008.4803000
Filename :
4803000