• DocumentCode
    2851201
  • Title

    Classifying biomedical citations without labeled training examples

  • Author

    Li, Xiaoli ; Joshi, Rohit ; Ramachandaran, Sreeram ; Leong, Tze-Yun

  • Author_Institution
    Comput. Sci. Program, Singapore MIT Alliance, Singapore
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    455
  • Lastpage
    458
  • Abstract
    In this paper we introduce a novel technique for classifying text citations without labeled training examples. We first use the search results of a general search engine as the original training data. We then propose a mutually reinforcing learning (MRL) algorithm to mine the classification knowledge and to "clean" the training data. With the help of a set of established domain-specific ontological terms or keywords, the MRL mining step derives the relevant classification knowledge. The MRL cleaning step then builds a naive Bayes classifier based on the mined classification knowledge and tries to clean the training set. The MRL algorithm is applied iteratively until a clean training set is obtained. We show the effectiveness of the proposed technique in the classification of biomedical citations from a large medical literature database.
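    The mine-then-clean loop described in the abstract can be illustrated with a small sketch. This is not the authors' MRL algorithm, whose details the abstract does not give. It is a minimal stand-in that assumes (a) "mining" means keeping the ontology terms that actually occur in the current training set, and (b) "cleaning" means dropping documents whose naive Bayes prediction disagrees with their search-derived label. All function names, the toy documents, and the stopping rule are hypothetical.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenizer (an assumption; the paper's preprocessing is not given)."""
    return text.lower().split()


def mine_terms(docs, ontology_terms):
    """Mining stand-in: keep the ontology terms that occur in the current training set."""
    seen = {w for d in docs for w in tokens(d)}
    return {t for t in ontology_terms if t in seen}


def train_nb(docs, labels, features):
    """Multinomial naive Bayes counts, restricted to the mined feature terms."""
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    term_counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        term_counts[y].update(w for w in tokens(d) if w in features)
    return classes, doc_counts, term_counts, features


def predict_nb(model, doc):
    """Return the class with the highest log-posterior (Laplace-smoothed likelihoods)."""
    classes, doc_counts, term_counts, features = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for c in classes:
        total = sum(term_counts[c].values())
        score = math.log(doc_counts[c] / n_docs)
        for w in tokens(doc):
            if w in features:
                score += math.log((term_counts[c][w] + 1) / (total + len(features)))
        scores[c] = score
    return max(scores, key=scores.get)


def clean_training_set(docs, labels, ontology_terms, max_iter=10):
    """Alternate mining and cleaning until no document is removed (hypothetical stopping rule)."""
    docs, labels = list(docs), list(labels)
    for _ in range(max_iter):
        features = mine_terms(docs, ontology_terms)
        model = train_nb(docs, labels, features)
        kept = [(d, y) for d, y in zip(docs, labels) if predict_nb(model, d) == y]
        if len(kept) == len(docs) or not kept:
            break  # training set is stable, or cleaning would remove everything
        docs, labels = map(list, zip(*kept))
    return docs, labels


# Toy "search results": the label is the query that returned the document.
docs = [
    "tumor biopsy carcinoma",
    "carcinoma tumor chemotherapy",
    "insulin glucose diabetes",
    "glucose insulin pancreas",
    "insulin glucose diabetes",  # noisy hit returned for the "cancer" query
]
labels = ["cancer", "cancer", "diabetes", "diabetes", "cancer"]
ontology_terms = {"tumor", "carcinoma", "chemotherapy", "insulin", "glucose", "diabetes"}

clean_docs, clean_labels = clean_training_set(docs, labels, ontology_terms)
```

    In this toy data the fifth document is the noisy one: its terms match the "diabetes" documents, so the classifier disagrees with its "cancer" label and the cleaning step drops it, after which the training set is stable. A real system would need a softer removal criterion and a richer mining step than this sketch assumes.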
  • Keywords
    Bayes methods; citation analysis; classification; learning (artificial intelligence); medical information systems; biomedical citation classification; classification knowledge; domain-specific ontologies; labeled training examples; mutually reinforcing learning algorithm; naive Bayes classifier; search engine; training data; Biomedical computing; Cancer; Cleaning; Diseases; Iterative algorithms; Labeling; Ontologies; Search engines; Text categorization; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type

    conf

  • DOI
    10.1109/ICDM.2004.10039
  • Filename
    1410334