• DocumentCode
    3175355
  • Title
    Parallel Lexical Analyzer on the Cell Processor
  • Author
    Srikanth, Umarani G.
  • Author_Institution
    Comput. Sci. & Eng. Dept., S.A.Eng. Coll., Chennai, India
  • fYear
    2010
  • fDate
    9-11 June 2010
  • Firstpage
    28
  • Lastpage
    29
  • Abstract
    Pattern matching, or finding the occurrences of a pattern in a text, arises frequently in many applications. The task of splitting a character stream or text into words is called tokenization. Search engines use tokenizers. The first phase of a compiler outputs a stream of tokens for the given high-level language program. The pattern rules are specified as regular expressions. Many tools that generate tokenizers automatically have been developed in the past, but most of them are sequential. The advent of multicore architectures has made it necessary for software tools to use features such as multiple threads and SIMD instructions. This work attempts to parallelize tokenization. It is a simple prototype implementation of a parallelized lexical analyzer that recognizes the tokens of the given source code. Each Synergistic Processing Element (SPE) of the Cell processor works on a block of source code and tokenizes it independently. The Power Processor Element (PPE) is responsible for splitting the source code into a finite number of blocks to be used by the different processing elements. Each SPE sends its stream of identifiers to the PPE, which maintains the symbol table. The parallel lexical analyzer runs on the IBM Cell processor simulator, and execution times are plotted for varying code sizes and numbers of processing elements.
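    The division of work described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it uses Python threads in place of SPEs, a regular expression in place of whatever matcher the prototype uses, and a made-up token set for a tiny C-like language. The names `split_blocks`, `tokenize_block`, and `parallel_lex` are hypothetical. The coordinating function plays the role of the PPE (splitting the source and maintaining the symbol table), and each worker plays the role of one SPE.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical token rules for a tiny C-like language (not taken from the paper).
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<OP>[-+*/=<>!]=?|[(){};,])
  | (?P<SKIP>\s+)
  | (?P<ERROR>.)
""", re.VERBOSE)
KEYWORDS = {"int", "if", "else", "while", "return"}


def split_blocks(source, n_blocks):
    """Cut the source into about n_blocks pieces, moving each cut forward
    to the next whitespace character so that no token straddles two blocks.
    Assumption: no string literals or comments that contain whitespace."""
    blocks, start = [], 0
    size = max(1, len(source) // n_blocks)
    while start < len(source):
        end = min(len(source), start + size)
        while end < len(source) and not source[end].isspace():
            end += 1
        blocks.append(source[start:end])
        start = end
    return blocks


def tokenize_block(block):
    """Worker role (one SPE in the abstract): tokenize one block independently."""
    tokens = []
    for match in TOKEN_RE.finditer(block):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


def parallel_lex(source, n_workers=4):
    """Coordinator role (the PPE in the abstract): split the source, hand the
    blocks to the workers, then build the symbol table from their identifiers."""
    blocks = split_blocks(source, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_block = list(pool.map(tokenize_block, blocks))  # keeps block order
    tokens = [tok for block_tokens in per_block for tok in block_tokens]
    symbol_table = {}
    for kind, text in tokens:
        if kind == "IDENT":
            symbol_table.setdefault(text, len(symbol_table))
    return tokens, symbol_table


if __name__ == "__main__":
    src = "int x = 10; int y = x + 2; while (x >= y) x = x - 1;"
    toks, table = parallel_lex(src, n_workers=3)
    print(toks)
    print(table)
```

    Cutting only at whitespace is the simplest way to keep blocks independent; a real lexer would also need to handle string literals and comments that span a cut. Threads in CPython will not show a speedup for this workload, so the sketch illustrates only the structure, not the timing behaviour the paper measures.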
  • Keywords
    coprocessors; high level languages; multiprocessing programs; parallelising compilers; pattern matching; search engines; software tools; source coding; cell processor; compiler; high-level language; parallel lexical analyzer; pattern matching; power processing unit; search engines; software tools; source code; synergetic processing element; tokenization; Computer architecture; High level languages; Multicore processing; Pattern matching; Program processors; Prototypes; Search engines; Software prototyping; Software tools; Yarn; Aho-Corasick algorithm; Cell Processor; Lexical analyser; Multicore architecture;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Secure Software Integration and Reliability Improvement Companion (SSIRI-C), 2010 Fourth International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-1-4244-7644-2
  • Type
    conf
  • DOI
    10.1109/SSIRI-C.2010.16
  • Filename
    5521554