• DocumentCode
    3738573
  • Title
    Predicting E. Coli promoters using formal grammars
  • Author
    Aljoharah Algwaiz;Sanguthevar Rajasekaran;Reda Ammar
  • Author_Institution
    Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA
  • fYear
    2015
  • Firstpage
    544
  • Lastpage
    547
  • Abstract
    Ever since the structure of DNA was discovered, linguistics has been part of molecular biology [13]. Grammatical linguistics is a powerful method for expressing information and describing its structure. It can be used to express the information transcribed in DNA. Most formal grammar applications to DNA are based on Searls' DNA parsing approach, which uses Prolog-based Definite Clause Grammars (DCG) [11]. Extensions of this approach include String Variable Grammar [6] and Basic Gene Grammars [5]. This paper presents a novel approach that parses Escherichia coli (E. coli) promoter sequences using a Context-Free Grammar (CFG). The approach takes advantage of an error-correcting parsing algorithm introduced by Rajasekaran and Nicolae [1]. The idea is to derive a grammar for known promoter regions and then modify this grammar to tolerate errors. The resulting cover grammar can then be employed to recognize promoter regions. Introducing probabilities into the production rules can further extend the cover grammar. Note that this paper only introduces the paradigm; implementing the approach and testing it on various datasets are left to future work.
  • Keywords
    "Grammar","DNA","Standards","Production","Pragmatics","Polymers","Probabilistic logic"
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Information Technology (ISSPIT), 2015 IEEE International Symposium on
  • Type
    conf
  • DOI
    10.1109/ISSPIT.2015.7394396
  • Filename
    7394396