DocumentCode
2753957
Title
Ontology-Based Feature Weighting for Biomedical Literature Classification
Author
He, Dan ; Wu, Xindong
Author_Institution
Dept. of Comput. Sci., Vermont Univ., Burlington, VT
fYear
2006
fDate
16-18 Sept. 2006
Firstpage
280
Lastpage
285
Abstract
Ontology-based methods have recently been applied to biomedical literature classification tasks. By mapping lexically different but semantically similar words into features in the domain ontology that underlies the words, we can achieve at least two benefits: the dimensionality of the feature space can be reduced effectively, and the semantic information that underlies the lexical words can be incorporated into the classification process, leading to better classification accuracy. In this paper, we propose an ontology-based feature weighting strategy for the biomedical literature classification problem. We assign weights to the features into which the lexical words are mapped, according to the structure of the domain ontology, and further optimize the weights using cross-validation. Our experiments on MEDLINE-indexed journal abstracts demonstrate that our method can achieve a significant improvement in classification accuracy, especially when the classification task is hard.
Keywords
biology computing; classification; ontologies (artificial intelligence); MEDLINE-indexed journal abstract; biomedical literature classification; lexical words; ontology-based feature weighting; Abstracts; Classification algorithms; Computer science; Frequency estimation; Information retrieval; Nearest neighbor searches; Ontologies; Text categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration, 2006 IEEE International Conference on
Conference_Location
Waikoloa Village, HI
Print_ISBN
0-7803-9788-6
Type
conf
DOI
10.1109/IRI.2006.252426
Filename
4018503