DocumentCode
3764518
Title
Term weighting using contextual information for categorization of unstructured text documents
Author
Anagha Kulkarni;Vrinda Tokekar;Parag Kulkarni
Author_Institution
Cummins COE, Pune, India
fYear
2015
Firstpage
1
Lastpage
4
Abstract
During categorization of text documents, term weighting assigns appropriate weights to different terms. However, terms with equal weights may contribute differently to determining the context of a document. This paper proposes a novel concept of associating positional context among regions for term weighting. For this, Dynamic Partitioning of text documents with First and Last Partitions (DynaPart-FiLa) is proposed. Experiments show that associating positional context improves the F-measure by 11.9% for Reuters-21578, 23.6% for talk.* Newsgroups, and 34.82% for Reuters Corpus Volume I (RCV1) in comparison to a traditional term weighting scheme. The performance improvement comes at the expense of a small additional storage cost.
Keywords
"Context","Support vector machines","Training","Complexity theory","Standards","Kernel","Text categorization"
Publisher
IEEE
Conference_Titel
India Conference (INDICON), 2015 Annual IEEE
Electronic_ISBN
2325-9418
Type
conf
DOI
10.1109/INDICON.2015.7443216
Filename
7443216