DocumentCode :
2753047
Title :
Highly compressed multi-pattern string matching on the cell broadband engine
Author :
Zha, Xinyan ; Scarpazza, Daniele Paolo ; Sahni, Sartaj
Author_Institution :
Comput. & Inf. Sci. & Eng., Univ. of Florida, Gainesville, FL, USA
fYear :
2011
fDate :
June 28 2011-July 1 2011
Firstpage :
257
Lastpage :
264
Abstract :
With its 9 cores per chip, the IBM Cell/Broadband Engine (Cell) can deliver an impressive amount of compute power and benefit the string-matching kernels of network security, network business analytics and natural language processing applications. However, the available amount of main memory on the system limits the maximum size of the dictionary supported by the string matching solution. To counter that, we propose a technique that employs compressed Aho-Corasick automata to perform fast, exact multi-pattern string matching with very large dictionaries. Our technique achieves the remarkable compression factors of 1:34 and 1:58, respectively, on the memory representation of English-language dictionaries and random binary string dictionaries. We demonstrate a parallel implementation for the Cell processor that delivers a sustained throughput between 0.90 and 2.35 Gbps per Cell blade, while supporting dictionary sizes up to 9.2 million average patterns per Gbyte of main memory, and exhibiting resilience to content-based attacks. This high dictionary density enables natural language applications of an unprecedented scale to run on a single server blade.
Keywords :
automata theory; multiprocessing systems; string matching; Aho-Corasick automata; English language dictionary; IBM Cell broadband engine; content-based attack; multipattern string matching; natural language processing application; network business analytics application; network security application; Automata; Computer architecture; Dictionaries; Engines; Microprocessors; Optimization; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computers and Communications (ISCC), 2011 IEEE Symposium on
Conference_Location :
Kerkyra
ISSN :
1530-1346
Print_ISBN :
978-1-4577-0680-6
Electronic_ISBN :
1530-1346
Type :
conf
DOI :
10.1109/ISCC.2011.5983850
Filename :
5983850