DocumentCode
2370475
Title
CGM: A biomedical text categorization approach using concept graph mining
Author
Bleik, Said ; Song, Min ; Smalter, Aaron ; Huan, Jun ; Lushington, Gerald
Author_Institution
Dept. of Inf. Syst., New Jersey Inst. of Technol., Newark, NJ, USA
fYear
2009
fDate
1-4 Nov. 2009
Firstpage
38
Lastpage
43
Abstract
Text categorization is used to organize and manage biomedical text databases that are growing at an exponential rate. Feature representations for documents are a crucial factor in the performance of text categorization. Most of the successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which we represent a document as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared with categorization based only on the extracted concepts.
Keywords
data mining; medical computing; text analysis; biomedical text categorization approach; biomedical text databases; concept graph mining; graph kernel techniques; Data mining; Engineering management; Information retrieval; Kernel; Management information systems; Spatial databases; Technology management; Text categorization; Unified modeling language; User-generated content
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine Workshop, 2009. BIBMW 2009. IEEE International Conference on
Conference_Location
Washington, DC
Print_ISBN
978-1-4244-5121-0
Type
conf
DOI
10.1109/BIBMW.2009.5332134
Filename
5332134