DocumentCode :
3533312
Title :
An integrated approach of sequence and text mining technology for the identification of transcription factor binding sites
Author :
Xiong, Yun ; Yang, Qing ; Qiu, Boren ; Zhu, Yangyong
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai
fYear :
2008
fDate :
3-5 Nov. 2008
Firstpage :
178
Lastpage :
184
Abstract :
The study of the complex mechanisms that regulate gene expression at the level of transcription is an important and challenging issue in the post-genomic era. A crucial step is to identify transcription factor binding sites (TFBSs). However, the number of known TFBSs is limited, and the accuracy of state-of-the-art identification methods is still far from satisfactory. In this paper, a novel integrated method for mining transcription factor binding sites is presented, which combines a sequence data mining method with a text mining method. The method can therefore not only obtain putative TFBSs from sequence data sets, but also acquire experimentally verified TFBSs from the literature. To evaluate the performance of our method, several experiments were conducted on real data sets. The results show that our integrated method outperforms each of the two algorithms used alone and, furthermore, achieves higher accuracy than existing algorithms.
Keywords :
data mining; text analysis; sequence data mining; state-of-the-art identification methods; text mining technology; transcription factor binding sites; Bioinformatics; Biological control systems; Biological processes; Computer science; Data mining; Databases; Gene expression; Sequences; Testing; Text mining; binding site; bioinformatics; data mining; sequence mining; text mining; transcription factor;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops, 2008. BIBMW 2008. IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4244-2890-8
Type :
conf
DOI :
10.1109/BIBMW.2008.4686233
Filename :
4686233