• DocumentCode
    1882800
  • Title
    Rapid sequence identification of potential pathogens using techniques from sparse linear algebra
  • Author
    Dodson, Stephanie ; Ricke, Darrell O. ; Kepner, Jeremy ; Chiu, Nelson ; Shcherbina, Anna
  • Author_Institution
    MIT Lincoln Laboratory, Lexington, MA, USA
  • fYear
    2015
  • fDate
    14-16 April 2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The decreasing costs and increasing speed and accuracy of DNA sample collection, preparation, and sequencing have rapidly produced an enormous volume of genetic data. However, fast and accurate analysis of the samples remains a bottleneck. Here we present D4RAGenS, a genetic sequence identification algorithm that exhibits the Big Data handling and computational power of the Dynamic Distributed Dimensional Data Model (D4M). The method leverages linear algebra and statistical properties to increase computational performance while retaining accuracy by subsampling the data. Two run modes, Fast and Wise, yield speed and precision tradeoffs, with applications in biodefense and medical diagnostics. The D4RAGenS analysis algorithm is tested over several datasets, including three utilized for the Defense Threat Reduction Agency (DTRA) metagenomic algorithm contest.
  • Keywords
    IEEE Xplore; Portable document format;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Technologies for Homeland Security (HST), 2015 IEEE International Symposium on
  • Conference_Location
    Waltham, MA, USA
  • Print_ISBN
    978-1-4799-1736-5
  • Type
    conf
  • DOI
    10.1109/THS.2015.7225316
  • Filename
    7225316