DocumentCode
431002
Title
A modified Kohonen network for DNA splice junction classification
Author
Naenna, Thanakorn ; Bress, Robert A. ; Embrechts, Mark J.
Author_Institution
Dept. of Ind. Eng., Mahidol Univ., Nakornpathom, Thailand
Volume
B
fYear
2004
fDate
21-24 Nov. 2004
Firstpage
215
Abstract
This paper describes an application of Kohonen networks, also known as self-organizing maps (SOMs), to exon/intron classification in DNA using windowed splice junction data. Splice junctions are groups of nucleotides that mark the boundaries between coding sections of DNA (exons) and noncoding sections (introns). Genes are often interrupted by sections of noncoding DNA. The data used in this study are human DNA sequences taken from the National Center for Biotechnology Information (http://www.ncbi.nih.gov/). The dataset contains 1,424 DNA sequences with 128 descriptors per sequence. SOMs were used to classify each sequence into one of three categories: transitions from gene (exon) to nongene (intron), transitions from nongene (intron) to gene (exon), and no transition, where the two-basepair code for the splice junction was coincidental. The multidimensional sequences are clustered into a two-dimensional space that is displayed graphically for data exploration and classification. The topographic properties of SOMs keep similar sequences close to each other on the output map. Clusters are identified and labeled according to the classes of the output neurons they contain, and each output neuron is labeled with the class most frequently mapped to it.
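The labeling rule in the abstract (each output neuron takes the class most frequently mapped to it) can be illustrated with a minimal sketch. This is a generic rectangular SOM in plain Python, not the modified network from the paper; the function names, the linear decay schedule, the Gaussian neighborhood, and all parameter defaults are assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter, defaultdict


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) to a weight list.

    The decay schedule and neighborhood function are illustrative
    assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])   # pull weights toward x
            step += 1
    return weights


def label_neurons(weights, data, labels):
    """Label each neuron that wins at least once with its most frequent class."""
    hits = defaultdict(Counter)
    for x, y in zip(data, labels):
        hits[best_matching_unit(weights, x)][y] += 1
    return {node: counts.most_common(1)[0][0] for node, counts in hits.items()}


def classify(weights, neuron_labels, x):
    """Return the label of x's winning neuron, or None if it was never labeled."""
    return neuron_labels.get(best_matching_unit(weights, x))
```

In this sketch a sequence would be passed in as a numeric vector (128 values per sequence in the dataset described above); how the paper derives those descriptors from the nucleotides is not stated in the abstract, so no encoding step is shown.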
Keywords
DNA; biology computing; biotechnology; genetics; pattern classification; self-organising feature maps; DNA sequences; DNA splice junction classification; Kohonen network; bioinformatics; gene; neurons; self-organizing maps; Bioinformatics; Biological materials; Frequency; Genetics; Humans; Multidimensional systems; Neurons; Self organizing feature maps; Sequences
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON 2004. 2004 IEEE Region 10 Conference
Print_ISBN
0-7803-8560-8
Type
conf
DOI
10.1109/TENCON.2004.1414570
Filename
1414570