Title of article :
Mining Association Rules From Biological Databases
Author/Authors :
Andres Rodriguez-Gonzalez, Jose-Maria Carazo, Oswaldo Trelles
Issue Information :
Monthly, serial issue number, 2005
Abstract :
We present a novel application of knowledge discovery
technology to bioinformatics, a developing and challenging
application area. This methodology allows
the identification of relationships between low-magnitude
similarity (LMS) sequence patterns and other well-contrasted
protein characteristics, such as those
described by database annotations. We start with the
identification of these signals inside protein sequences
by exhaustive database searching and automatic pattern
recognition strategies. In a second step we address the
discovery of association rules that allow tagging
sequences that hold LMS signals with consequent functional
keywords. We have designed our own algorithm for
discovering association rules, meeting the special necessities
of bioinformatics problems, where the patterns we
search lie in sparse datasets and are uncommon and thus
difficult to locate. Computational efficiency has been verified
both with synthetic and real biological data showing
that the algorithm is well suited to this application area
compared to state of the art algorithms. The usefulness of
the method is confirmed by its ability to produce previously
unknown and useful knowledge in the area of biological
sequence analysis. In addition, we introduce a
new and promising application of the rule extraction
algorithm to gene expression databases.
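
To make the second step of the abstract concrete, the following is a minimal sketch of how rules of the form "LMS pattern implies functional keyword" can be scored with the standard support and confidence measures. It is not the authors' algorithm, which the abstract does not describe. The protein identifiers, pattern names, keywords, thresholds, and the function name `mine_rules` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data (not from the paper): each protein maps to
# (LMS patterns found in its sequence, annotation keywords).
proteins = {
    "P1": ({"lms_a"}, {"kinase"}),
    "P2": ({"lms_a", "lms_b"}, {"kinase", "membrane"}),
    "P3": ({"lms_b"}, {"membrane"}),
    "P4": ({"lms_a"}, {"kinase"}),
    "P5": (set(), {"membrane"}),
}


def mine_rules(records, min_support=0.2, min_confidence=0.8):
    """Return (pattern, keyword, support, confidence) for single-item rules.

    support    = fraction of all proteins holding both pattern and keyword
    confidence = fraction of proteins with the pattern that also have the keyword
    """
    n = len(records)
    pattern_count = Counter()
    pair_count = Counter()
    for patterns, keywords in records.values():
        pattern_count.update(patterns)
        # Count every (pattern, keyword) co-occurrence within one protein.
        pair_count.update(product(patterns, keywords))

    rules = []
    for (pattern, keyword), both in pair_count.items():
        support = both / n
        confidence = both / pattern_count[pattern]
        if support >= min_support and confidence >= min_confidence:
            rules.append((pattern, keyword, support, confidence))
    return sorted(rules)


rules = mine_rules(proteins)
for pattern, keyword, support, confidence in rules:
    print(f"{pattern} -> {keyword}  support={support:.2f}  confidence={confidence:.2f}")
```

With rare patterns in sparse data, as the abstract notes, the support threshold would have to be set very low, which is one reason exhaustive counting like this sketch may not scale to real biological databases.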
Journal title :
Journal of the American Society for Information Science and Technology