Title :
Improved H3K27ac histone mark prediction using k-mer proximity feature
Author :
Pui Kwan Fong;Nung Kion Lee
Author_Institution :
Faculty of Cognitive Sciences and Human Development, Universiti Malaysia Sarawak, Kota Samarahan, Malaysia
Abstract :
Predicting enhancers, a class of gene regulatory elements, is computationally challenging because the features associated with them are poorly understood. Several histone marks are known to be associated with enhancer locations and have been used successfully to predict the approximate locations of thousands of enhancers. The k-mer (a short contiguous nucleotide sequence of length k) is one of the most commonly engineered features from histone-marked sequences for machine learning tasks. However, a large k-mer feature set (i.e. 5 ≤ k ≤ 7) is usually needed to perform well, and no domain knowledge is used. In this study we propose the k-mer proximity feature, which is domain dependent, to represent H3K27ac histone enrichment in DNA sequences. This feature represents the spatial content of DNA sequences. We compare the performance of the proximity feature and the k-mer feature for H3K27ac mark prediction, and the results indicate that the proposed feature gives higher prediction accuracy. These findings support the view that the proximity feature is a more distinguishing feature of DNA sequences with histone modification enrichment.
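For orientation, the baseline k-mer feature mentioned in the abstract can be sketched as a normalized count of every length-k substring over the alphabet A, C, G, T. This is only an illustrative sketch of the conventional k-mer frequency representation, not the authors' implementation; the function name `kmer_frequencies` and the choice to skip windows containing other characters (such as N) are assumptions. The proposed proximity feature is not sketched here because the abstract does not define it.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return the normalized frequency of each of the 4**k DNA k-mers.

    Hypothetical helper: windows containing characters other than
    A, C, G, T (for example N) are ignored, which is an assumption.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Fixed vocabulary so every sequence maps to a vector of the same length.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in vocabulary)
    return {
        kmer: (counts[kmer] / total if total else 0.0)
        for kmer in vocabulary
    }


if __name__ == "__main__":
    features = kmer_frequencies("ACGTAC", 2)
    print(len(features), features["AC"])
```

The vocabulary size grows as 4^k, so k = 5 already gives 1,024 features and k = 7 gives 16,384, which illustrates why the abstract describes the required k-mer feature set as large.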
Keywords :
"DNA","Feature extraction","Support vector machines","Data mining","Frequency conversion","Genomics"
Conference_Title :
IT in Asia (CITA), 2015 9th International Conference on
DOI :
10.1109/CITA.2015.7349830