• DocumentCode
    3764518
  • Title

    Term weighting using contextual information for categorization of unstructured text documents

  • Author

    Anagha Kulkarni;Vrinda Tokekar;Parag Kulkarni

  • Author_Institution
    Cummins COE, Pune, India
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    During categorization of text documents, term weighting assigns appropriate weights to different terms. Terms with equal weights can nonetheless contribute differently to determining the context of a document. This paper proposes a novel concept of associating positional context among regions for term weighting. For this purpose, Dynamic Partitioning of text documents with First and Last Partitions (DynaPart-FiLa) is proposed. Experiments show that associating positional context improves F-measure by 11.9% for Reuters-21578, 23.6% for talk.* Newsgroups, and 34.82% for Reuters Corpus Volume I (RCV1) compared with a traditional term weighting scheme. The performance improvement comes at the expense of a small additional storage cost.
  • Keywords
    "Context","Support vector machines","Training","Complexity theory","Standards","Kernel","Text categorization"
  • Publisher
    IEEE
  • Conference_Title
    India Conference (INDICON), 2015 Annual IEEE
  • Electronic_ISBN
    2325-9418
  • Type

    conf

  • DOI
    10.1109/INDICON.2015.7443216
  • Filename
    7443216