DocumentCode :
3011663
Title :
An improved keyword extraction method using graph based random walk model
Author :
Islam, Md Rafiqul ; Islam, Md Rafiqul
Author_Institution :
Dept. of Comput. Sci. & Eng. Discipline, Khulna Univ., Khulna
fYear :
2008
fDate :
24-27 Dec. 2008
Firstpage :
225
Lastpage :
229
Abstract :
Keywords can be considered condensed versions of documents and can play an important role in text processing tasks such as text indexing, summarization, and categorization. However, many digital documents, especially on the Internet, do not have a list of assigned keywords. Assigning keywords to these documents manually is a difficult task that requires appropriate knowledge of the topic. An automatic keyword extraction process can solve this problem. In this paper, we introduce an improved method for keyword extraction using a random walk model that considers the position of terms within the document and the information gain of terms with respect to the whole set of documents. We also incorporate the mutual information (MI) of terms into the random walk model to extract keywords from documents. Experiments on standard test collections show that our method outperforms previously proposed methods.
Keywords :
graph theory; information retrieval; random processes; text analysis; digital document; document keyword assignment; graph-based random walk model; improved keyword extraction method; mutual information; text processing task; text summary; Casting; Citation analysis; Computer science; Data mining; Indexing; Information technology; Internet; Mutual information; Text processing; Voting; Keyword extraction; information gain; mutual information; random walk model; term position;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
Conference_Location :
Khulna
Print_ISBN :
978-1-4244-2135-0
Electronic_ISBN :
978-1-4244-2136-7
Type :
conf
DOI :
10.1109/ICCITECHN.2008.4802967
Filename :
4802967