DocumentCode
3189054
Title
Statistical Approaches to Identifying Androgen Response Elements
Author
Li, Li ; Heber, Steffen ; Zhang, Qiang ; Andersen, Melvin E.
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
95
Lastpage
100
Abstract
DNA-binding transcription factors play an integral role in regulating gene expression. Transcription factor binding sites (TFBSs) in gene promoter regions can be predicted with computational methods such as Support Vector Machines (SVM), Hidden Markov Models (HMM), and Random Forests (RF), all of which summarize sequence patterns of experimentally determined TFBSs. The androgen receptor (AR), a ligand-dependent transcription factor, plays an important role in male reproductive function by regulating gene transcription through direct binding to androgen response elements (AREs) in target gene promoters. The aim of this study is to use data mining tools to identify and characterize AREs based on sequence information. Three statistical methods were explored to strengthen the prediction of putative AREs in the human genome. Cross-validation results indicated that all three models provided good sensitivity and specificity in identifying AREs, with an accuracy of at least 80%. This is the first time that HMM, SVM, and RF have all been applied to constructing ARE prediction models.
Keywords
Bioinformatics; Data mining; Gene expression; Genomics; Hidden Markov models; Humans; Radio frequency; Radiofrequency identification; Statistical analysis; Support vector machines
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
Conference_Location
Omaha, NE
Print_ISBN
978-0-7695-3019-2
Electronic_ISBN
978-0-7695-3033-8
Type
conf
DOI
10.1109/ICDMW.2007.81
Filename
4476652