• DocumentCode
    710237
  • Title

    High-Performance Biomedical Association Mining with MapReduce

  • Author

    Yanqing Ji; Yun Tian; Fangyang Shen; John Tran

  • Author_Institution
    Gonzaga Univ., Spokane, WA, USA
  • fYear
    2015
  • fDate
    13-15 April 2015
  • Firstpage
    465
  • Lastpage
    470
  • Abstract
    MapReduce has been applied to data-intensive applications in different domains because of its simplicity, scalability, and fault tolerance. However, its use in biomedical association mining remains very limited. In this paper, we investigate using MapReduce to efficiently mine the associations between biomedical terms extracted from a set of biomedical articles. First, biomedical terms were obtained by matching text to the Unified Medical Language System (UMLS) Metathesaurus, a biomedical vocabulary and standard database. Then we developed a MapReduce algorithm that can calculate a category of interestingness measures defined on the basis of a 2×2 contingency table. This algorithm consists of two MapReduce jobs and takes a stripes approach to reduce the number of intermediate results. Experiments were conducted using Amazon Elastic MapReduce (EMR) with an input of 3610 articles retrieved from two biomedical journals. Test results indicate that our algorithm has linear scalability.
  • Keywords
    data mining; distributed processing; medical computing; Amazon Elastic MapReduce; EMR; MapReduce algorithm; UMLS; biomedical articles; biomedical terms; biomedical vocabulary; data intensive applications; high-performance biomedical association mining; linear scalability; metathesaurus; standard database; text matching; unified medical language system; Biomedical measurement; Clustering algorithms; Data mining; Databases; Servers; Standards; Unified modeling language; Association Mining; Biomedical Literature; High-Performance Computing; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology - New Generations (ITNG), 2015 12th International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    978-1-4799-8827-3
  • Type

    conf

  • DOI
    10.1109/ITNG.2015.80
  • Filename
    7113516
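
The abstract mentions a stripes approach and interestingness measures built on a 2×2 contingency table, but this record does not give the algorithm itself. The following is only a minimal in-memory sketch of those two ideas, not the authors' implementation: the function names, the toy documents, and the choice of lift as the example measure are all assumptions made for illustration, and the paper's actual division of work between its two MapReduce jobs is not reproduced here.

```python
from collections import Counter, defaultdict


def map_stripes(doc_terms):
    """Map step (hypothetical): emit one stripe per distinct term in a document.

    A stripe is a Counter of the other terms that co-occur with the key term,
    so one record per term replaces one record per term pair.
    """
    terms = set(doc_terms)
    for term in terms:
        yield term, Counter(other for other in terms if other != term)


def reduce_stripes(pairs):
    """Reduce step (hypothetical): sum all stripes that share the same key."""
    merged = defaultdict(Counter)
    for key, stripe in pairs:
        merged[key].update(stripe)  # Counter.update adds counts element-wise
    return merged


def contingency(n_ab, n_a, n_b, n_docs):
    """Build the 2x2 table cells from co-occurrence and marginal counts."""
    n11 = n_ab                       # documents containing both A and B
    n10 = n_a - n_ab                 # A without B
    n01 = n_b - n_ab                 # B without A
    n00 = n_docs - n_a - n_b + n_ab  # neither A nor B
    return n11, n10, n01, n00


def lift(n11, n10, n01, n00):
    """One example measure defined on the table (chosen here for illustration)."""
    total = n11 + n10 + n01 + n00
    return (n11 * total) / ((n11 + n10) * (n11 + n01))


# Toy input (assumed): each document is the list of terms extracted from it.
docs = [
    ["aspirin", "headache"],
    ["aspirin", "headache", "fever"],
    ["fever", "ibuprofen"],
    ["aspirin", "ulcer"],
]

# Document frequency of each term, i.e. the table's marginal counts.
doc_freq = Counter(term for doc in docs for term in set(doc))

# Co-occurrence stripes, merged across all documents.
stripes = reduce_stripes(pair for doc in docs for pair in map_stripes(doc))

table = contingency(
    stripes["aspirin"]["headache"],
    doc_freq["aspirin"],
    doc_freq["headache"],
    len(docs),
)
score = lift(*table)
```

With these toy documents, "aspirin" and "headache" co-occur in two of four documents, so `table` is `(2, 1, 0, 1)`. Any other measure defined on the same four cells could be substituted for `lift` without changing the counting steps.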