DocumentCode
2503528
Title
Parallel Information Extraction on Shared Memory Multi-processor System
Author
Shan, Jiulong ; Chen, Yurong ; Diao, Qian ; Zhang, Yimin
Author_Institution
Intel China Res. Center, Beijing
fYear
2006
fDate
14-18 Aug. 2006
Firstpage
311
Lastpage
318
Abstract
Text mining is one of the best solutions to the information explosion of today and the future. With the development of modern processor technologies, it will become a mass-market desktop application in the many-core era. In a text mining system, information extraction is a representative module and the most compute-intensive part. In this paper, we study the performance of parallel information extraction on shared memory multi-processor systems in order to gain insight into how such applications will behave on future many-core architectures. In our implementation, the conditional random fields (CRFs) algorithm is selected as the core of the information extraction module. Based on FlexCRFs, the newest CRFs toolkit, we make several serial optimizations and then parallelize it with MPI and System V IPC/shm. We also conduct a detailed performance analysis of this parallel application on the target system.
Keywords
data mining; information retrieval; message passing; parallel processing; shared memory systems; CRFs toolkit; FlexCRF; MPI; System V IPC/shm; conditional random fields; many-core architecture; parallel information extraction; serial optimization; shared memory multiprocessor system; text mining; Bandwidth; Computer architecture; Data mining; Explosions; Humans; Internet; Performance analysis; Performance gain; Search engines; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing, 2006. ICPP 2006. International Conference on
Conference_Location
Columbus, OH
ISSN
0190-3918
Print_ISBN
0-7695-2636-5
Type
conf
DOI
10.1109/ICPP.2006.58
Filename
1690633