DocumentCode :
3087696
Title :
Extraction and Grounding of Protein Mutations via Semantic Integration of Text and Sequence Information
Author :
Baker, Christopher ; Kanagasabai, Rajaraman
Author_Institution :
Univ. of New Brunswick, St. John, NB, Canada
fYear :
2011
fDate :
22-25 March 2011
Firstpage :
556
Lastpage :
563
Abstract :
Rich information on mutations and their impacts is scattered across the scientific literature. Reusing mutation impact annotations requires, as a critical step, grounding mutations to the correct positions on sequences extracted from protein databases. This paper presents a generic method for grounding textual mentions of mutation entities to protein sequences, based on an OWL-DL ontology-driven workflow that integrates text and sequence information in a semantically consistent way. Mutation mentions mined from texts are iteratively mapped onto candidate proteins, and an ontology mining algorithm facilitates their correct grounding to a protein sequence. Using a gold-standard corpus of full-text articles and corresponding protein sequences, we show that the proposed method is promising compared to existing approaches.
Keywords :
biology computing; data mining; knowledge representation languages; text analysis; OWL-DL ontology; ontology mining algorithm; protein databases; protein mutations grounding; semantic integration; sequence information; text information; Databases; Grounding; Ontologies; Protein sequence; Text mining; Mutation Extraction; Mutation Grounding; Ontologies; Sequence Analysis; Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Information Networking and Applications (AINA), 2011 IEEE International Conference on
Conference_Location :
Biopolis
ISSN :
1550-445X
Print_ISBN :
978-1-61284-313-1
Electronic_ISBN :
1550-445X
Type :
conf
DOI :
10.1109/AINA.2011.112
Filename :
5763475