Title :
High performance of artificial neural network for resolving ambiguous nucleotide problem
Author :
Plaimas, Kitiporn ; Lursinsap, Chidchanok ; Suratanee, Apichat
Author_Institution :
Dept. of Math., Chulalongkorn Univ., Bangkok, Thailand
Abstract :
Information from DNA sequence data (strings of the symbols A, C, G, and T) is used to construct a method for resolving ambiguous symbols in DNA sequences. The feature extracted from a sequence is the relative position, meaning the nucleotides and their positions relative to the neighboring nucleotides of each strain; it is used for learning and prediction by a neural network for each symbol. To recognize all possible feature vectors, the large training set is divided into data subsets by the rule that the feature vectors of each subset share the same group of feature values in the key feature and that each subset is small enough to produce a network that recognizes it completely. As a result of this rule, the recognition network consists of sub-networks, one for each data subset, which can be trained simultaneously. Using this approach, we can obtain many sub optimal neural networks in place of one unacceptable network, reduce the training time, and facilitate the recognition of large data sets.
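The partitioning rule described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the window-based feature (`extract_window`), the choice of the first neighbor as the key feature (`key_index=0`), and above all `MajorityModel`, a simple lookup table that stands in for a trained sub-network so that the example runs without a neural network library.

```python
from collections import Counter, defaultdict


def extract_window(seq, pos, radius=1):
    """Hypothetical feature: the neighbors of seq[pos], padded with '-' at the ends."""
    padded = "-" * radius + seq + "-" * radius
    center = pos + radius
    left = padded[center - radius:center]
    right = padded[center + 1:center + radius + 1]
    return tuple(left + right)


def partition_by_key(samples, key_index=0):
    """Group (features, label) pairs so each subset shares one key-feature value."""
    subsets = defaultdict(list)
    for features, label in samples:
        subsets[features[key_index]].append((features, label))
    return dict(subsets)


class MajorityModel:
    """Stand-in for a sub-network (assumption): stores the most frequent label per vector."""

    def fit(self, subset):
        votes = defaultdict(Counter)
        for features, label in subset:
            votes[features][label] += 1
        self.table = {f: c.most_common(1)[0][0] for f, c in votes.items()}
        return self

    def predict(self, features):
        return self.table.get(features)


def train_submodels(samples, key_index=0):
    """Fit one independent model per subset; the fits do not depend on each other."""
    subsets = partition_by_key(samples, key_index)
    return {key: MajorityModel().fit(subset) for key, subset in subsets.items()}


def predict(models, features, key_index=0):
    """Route a feature vector to the model that owns its key-feature value."""
    model = models.get(features[key_index])
    return model.predict(features) if model is not None else None


# Toy data: predict each nucleotide of a short sequence from its two neighbors.
seq = "ACGTAC"
samples = [(extract_window(seq, i), seq[i]) for i in range(len(seq))]
models = train_submodels(samples)
```

Because each subset is fitted independently, the loop in `train_submodels` could be distributed across workers, which is the property the abstract relies on for simultaneous training.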
Keywords :
DNA; biology computing; data analysis; neural nets; pattern recognition; DNA sequence; artificial neural network; feature extraction; nucleotide problem; Artificial intelligence; Artificial neural networks; Backpropagation; Bioinformatics; DNA; Feature extraction; Genomics; Neural networks; Sequences; Testing;
Conference_Title :
Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International
Print_ISBN :
0-7695-2312-9
DOI :
10.1109/IPDPS.2005.245