• DocumentCode
    2258098
  • Title
    Parallel indexing in a Chinese information retrieval system
  • Author
    Wong, Kam-Fai ; Lum, Vincent Y.
  • Author_Institution
    Dept. of Syst. Eng., Chinese Univ. of Hong Kong, Shatin, Hong Kong
  • fYear
    1994
  • fDate
    9-11 Nov 1994
  • Firstpage
    320
  • Lastpage
    325
  • Abstract
    The increasing data size in Chinese information-based applications renders conventional information retrieval (IR) systems unsuitable, because they are limited in both storage and speed. To overcome these limitations, a parallel Chinese IR system (CIR) has been designed. It is being developed on a SIMD parallel computer, DECmpp, which is configured with 8,192 processing elements. It uses full inverted indices for retrieval. The “divide-and-conquer” principle is exercised in exploiting data parallelism in the inverted index files. The inverted indices are first partitioned into fragments. Each fragment is then assigned to an individual processing element. Thereafter, during an index retrieval operation, all index fragments are searched in parallel. Although the principle is simple, realising the parallel indexing algorithm in a naive fashion (i.e., without considering the underlying parallel architecture) would result in poor retrieval performance. During the design of the CIR system, three different implementation models for parallel indexing were considered. In this paper, a qualitative evaluation of the three models is presented. Based on the result of the evaluation, the model that offers the best run-time performance was adopted.
  • Keywords
    DEC computers; indexing; information retrieval systems; parallel processing; Chinese information retrieval system; DECmpp massively parallel processor; SIMD parallel computer; data parallelism; divide-and-conquer principle; full inverted indices; index fragments; parallel indexing; partitioned indexes; retrieval performance; run-time performance; Application software; Artificial intelligence; Concurrent computing; Engines; Indexing; Information retrieval; Information systems; Parallel architectures; Parallel processing; Systems engineering and theory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference, 1994. COMPSAC 94. Proceedings., Eighteenth Annual International
  • Conference_Location
    Taipei
  • Print_ISBN
    0-8186-6705-2
  • Type
    conf
  • DOI
    10.1109/CMPSAC.1994.342784
  • Filename
    342784
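
The partition-and-search idea described in the abstract can be illustrated with a minimal sketch. It is not one of the paper's three implementation models, which this record does not describe, and it does not model the DECmpp SIMD hardware: a thread pool stands in for the processing elements. The function names (`partition_index`, `search_fragment`, `parallel_lookup`), the routing of postings by document id, and the sample data are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_index(index, num_fragments):
    """Split an inverted index (term -> list of doc ids) into fragments.

    Assumption: postings are routed by doc id modulo the fragment count,
    so each fragment holds a slice of every term's posting list.
    """
    fragments = [dict() for _ in range(num_fragments)]
    for term, postings in index.items():
        for doc_id in postings:
            fragments[doc_id % num_fragments].setdefault(term, []).append(doc_id)
    return fragments


def search_fragment(fragment, term):
    """Look up one term in a single fragment (the work of one 'processing element')."""
    return fragment.get(term, [])


def parallel_lookup(fragments, term):
    """Search every fragment concurrently and merge the partial posting lists."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(search_fragment, fragments, [term] * len(fragments)))
    return sorted(doc_id for part in partials for doc_id in part)


# Hypothetical sample index; the terms and doc ids are made up for illustration.
index = {"信息": [1, 4, 7, 9], "检索": [2, 4, 9], "系统": [3, 7]}
fragments = partition_index(index, num_fragments=4)
print(parallel_lookup(fragments, "信息"))  # [1, 4, 7, 9]
```

Routing by document id keeps the fragments roughly balanced when posting lists vary in length. A term-based split would be an equally valid assumption with different load characteristics.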