DocumentCode :
1825024
Title :
Combining information extraction and text mining for cancer biomarker detection
Author :
Dawoud, Khaled ; Shang Gao ; Qabaja, Ala ; Karampelas, P. ; Alhajj, Reda
Author_Institution :
Dept. of Comput. Sci., Univ. of Calgary, Calgary, AB, Canada
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
948
Lastpage :
955
Abstract :
Information technology is advancing faster than anticipated. The amount of data captured and stored in electronic form far exceeds the capabilities available for comprehensive analysis and effective knowledge discovery. There is a continuing need for new, sophisticated techniques that can extract more of the knowledge hidden in the raw data collected continuously in huge repositories. Biomedicine and computational biology are among the domains overwhelmed with huge amounts of data that should be carefully analyzed for valuable knowledge that may help uncover much of the still unknown information related to various diseases threatening the human body. Biomarker detection is one of the areas that has received considerable attention in the research community. Two sources of data can be analyzed for biomarker detection: gene expression data and the rich literature related to the domain. Our research group has reported achievements analyzing both sources. In this paper, we concentrate on the latter by describing a powerful tool that is capable of (1) extracting from the content of a repository (like PubMed) the parts related to a given specific domain like cancer, (2) analyzing the retrieved text to extract the key terms with high frequency, (3) presenting the extracted terms to domain experts for selecting those most relevant to the investigated domain, (4) retrieving from the analyzed text molecules related to the domain by considering the relevant terms, and (5) deriving the network that will be analyzed to identify potential biomarkers. For the work described in this paper, we considered PubMed and extracted abstracts related to prostate and breast cancer. The reported results are promising; they demonstrate the effectiveness and applicability of the proposed approach.
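Illustrative sketch (not part of the original record):
The pipeline outlined in the abstract (frequent-term extraction, expert selection of relevant terms, molecule retrieval, network derivation) can be pictured with a minimal Python sketch. This is not the authors' tool. The toy abstracts, the placeholder molecule names (geneA, geneB, geneC), the stopword list, the same-abstract co-occurrence rule for edges, and the degree-based ranking are all assumptions made for illustration. The retrieval step from PubMed is omitted, and the expert selection is stood in for by a hand-picked set.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"and", "are", "the", "with", "were"}


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_terms(abstracts, k):
    """Return the k most frequent non-stopword terms across all abstracts."""
    counts = Counter()
    for text in abstracts:
        counts.update(
            t for t in tokenize(text) if t not in STOPWORDS and len(t) > 2
        )
    return [term for term, _ in counts.most_common(k)]


def cooccurrence_network(abstracts, molecules, relevant_terms):
    """Link two molecules each time they share an abstract that also
    mentions at least one expert-selected relevant term."""
    edges = Counter()
    for text in abstracts:
        tokens = set(tokenize(text))
        if not tokens & relevant_terms:
            continue  # abstract is not about the investigated domain
        present = sorted(m for m in molecules if m.lower() in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def rank_by_degree(edges):
    """Rank molecules by weighted degree, a simple stand-in for the
    network analysis step."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree.most_common()


# Toy input: made-up sentences with placeholder molecule names.
abstracts = [
    "geneA and geneB are overexpressed in prostate cancer tumor samples",
    "geneB interacts with geneC in prostate cancer tumor progression",
    "geneA and geneC were measured in healthy tissue",
]
molecules = ["geneA", "geneB", "geneC"]

candidate_terms = top_terms(abstracts, k=6)   # shown to a domain expert
relevant_terms = {"cancer"}                   # stands in for the expert's choice
network = cooccurrence_network(abstracts, molecules, relevant_terms)
ranking = rank_by_degree(network)
```

On this toy input the third abstract is skipped because it mentions no relevant term, so geneB, which co-occurs with both other placeholders, ends up with the highest degree. A real system would need proper entity recognition for molecules and a more careful network measure.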
Keywords :
biology computing; cancer; data mining; genetics; information retrieval; medical expert systems; medicine; molecular biophysics; storage management; text analysis; PubMed; biomedicine; breast cancer; cancer biomarker detection; comprehensive analysis; computational biology; data capture; data storage; domain experts; gene expression data; information extraction; information technology; knowledge discovery; knowledge extraction; prostate cancer; retrieved text analysis; text mining; text molecules; Abstracts; Buildings; Data mining; Diseases; Prostate cancer; Proteins; cancer biomarkers; information extraction; knowledge discovery; network analysis; text analysis; text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on
Conference_Location :
Niagara Falls, ON
Type :
conf
Filename :
6785814