• DocumentCode
    3059462
  • Title
    Boosting inductive transfer for text classification using Wikipedia
  • Author
    Banerjee, Somnath
  • Author_Institution
    Hewlett-Packard Labs, Bangalore
  • fYear
    2007
  • fDate
    13-15 Dec. 2007
  • Firstpage
    148
  • Lastpage
    153
  • Abstract
    Inductive transfer applies knowledge learned on one set of tasks to improve the performance of learning a new task. It is used to improve generalization performance on a classification task using models learned on related tasks. In this paper, we show a method of making inductive transfer for text classification more effective using Wikipedia. We map the text documents of the different tasks to a feature space created using Wikipedia, thereby providing some background knowledge of the contents of the documents. We observe that classifiers built using features generated from Wikipedia become more effective in transferring knowledge. An evaluation on the daily classification task on the Reuters RCV1 corpus shows that our method can significantly improve the performance of inductive transfer. Our method was also able to overcome a major obstacle observed in a recent work on a similar setting.
  • Keywords
    classification; content management; learning by example; text analysis; Wikipedia; document content; generalization performance; inductive transfer; knowledge transfer; text classification; text document mapping; Boosting; Discrete cosine transforms; Feeds; Image classification; Machine learning; Text categorization; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
  • Conference_Location
    Cincinnati, OH
  • Print_ISBN
    978-0-7695-3069-7
  • Type
    conf
  • DOI
    10.1109/ICMLA.2007.39
  • Filename
    4457223