• DocumentCode
    2503528
  • Title

    Parallel Information Extraction on Shared Memory Multi-processor System

  • Author

    Shan, Jiulong ; Chen, Yurong ; Diao, Qian ; Zhang, Yimin

  • Author_Institution
    Intel China Res. Center, Beijing
  • fYear
    2006
  • fDate
    14-18 Aug. 2006
  • Firstpage
    311
  • Lastpage
    318
  • Abstract
    Text mining is one of the best solutions for today's and the future's information explosion. With the development of modern processor technologies, it will be a mass-market desktop application in the many-core era. In a text mining system, information extraction is a representative module and is the most compute-intensive part. In this paper, we study the performance of parallel information extraction on shared memory multi-processor systems in order to gain some insight into such applications on future many-core architectures. In our implementation, the conditional random fields (CRFs) algorithm is selected as the core of the information extraction module. Based on the newest CRFs toolkit, FlexCRFs, we make several serial optimizations and then parallelize it with MPI and System V IPC/shm. We also conduct a detailed performance analysis of this parallel application on the target system.
  • Keywords
    data mining; information retrieval; message passing; parallel processing; shared memory systems; CRFs toolkit; FlexCRF; MPI; System V IPC/shm; conditional random fields; many-core architecture; parallel information extraction; serial optimization; shared memory multiprocessor system; text mining; Bandwidth; Computer architecture; Data mining; Explosions; Humans; Internet; Performance analysis; Performance gain; Search engines; Text mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2006. ICPP 2006. International Conference on
  • Conference_Location
    Columbus, OH
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2636-5
  • Type

    conf

  • DOI
    10.1109/ICPP.2006.58
  • Filename
    1690633