• DocumentCode
    1784915
  • Title

    Predicting protein localization using a domain adaptation naïve Bayes classifier with Burrows-Wheeler transform features

  • Author

    Herndon, Nic ; Tangirala, Karthik ; Caragea, Doina

  • Author_Institution
    Kansas State Univ., Manhattan, KS, USA
  • fYear
    2014
  • fDate
    2-5 Nov. 2014
  • Firstpage
    501
  • Lastpage
    504
  • Abstract
    The reduced cost of next generation sequencing technologies provides opportunities to study non-model organisms. However, one challenge is the large volume of data generated and, thus, the need to use automated approaches to annotate these data. Machine learning algorithms could provide a cost-effective solution, but they need large amounts of labeled data and informative features to represent these data. Our proposed approach addresses both of these problems by using a domain adaptation classifier in conjunction with features generated with unsupervised techniques to annotate biological sequence data.
  • Keywords
    Bayes methods; adaptive systems; bioinformatics; classification; data analysis; data structures; feature extraction; learning (artificial intelligence); macromolecules; molecular biophysics; molecular configurations; proteins; sequences; transforms; Burrows-Wheeler transform features; automated data annotation; biological sequence data annotation; cost-effective solution; data generation; data labeling; data representation; data volume; domain adaptation naive Bayes classifier; feature generation; informative feature; machine learning algorithm; next generation sequencing technology cost; nonmodel organism; protein localization prediction; unsupervised annotation; Accuracy; Algorithm design and analysis; Bioinformatics; Classification algorithms; Prediction algorithms; Proteins;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
  • Conference_Location
    Belfast
  • Type

    conf

  • DOI
    10.1109/BIBM.2014.6999209
  • Filename
    6999209