  • DocumentCode
    2370475
  • Title
    CGM: A biomedical text categorization approach using concept graph mining
  • Author
    Bleik, Said; Song, Min; Smalter, Aaron; Huan, Jun; Lushington, Gerald
  • Author_Institution
    Dept. of Inf. Syst., New Jersey Inst. of Technol., Newark, NJ, USA
  • fYear
    2009
  • fDate
    1-4 Nov. 2009
  • Firstpage
    38
  • Lastpage
    43
  • Abstract
    Text categorization is used to organize and manage biomedical text databases, which are growing at an exponential rate. The feature representation of documents is a crucial factor in text categorization performance. Most successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which a document is represented as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared to categorization based only on the extracted concepts.
  • Keywords
    data mining; medical computing; text analysis; biomedical text categorization approach; biomedical text databases; concept graph mining; graph kernel techniques; Data mining; Engineering management; Information retrieval; Kernel; Management information systems; Spatial databases; Technology management; Text categorization; Unified modeling language; User-generated content
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine Workshop, 2009. BIBMW 2009. IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-5121-0
  • Type
    conf
  • DOI
    10.1109/BIBMW.2009.5332134
  • Filename
    5332134