• DocumentCode
    2166410
  • Title

    Distributed parallel generation of indices for very large text databases

  • Author

    Kitajima, J.P. ; Resende, M.D. ; Ribeiro-Neto, B. ; Ziviani, N.

  • Author_Institution
    Dept. de Ciencia da Comput., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil
  • fYear
    1997
  • fDate
    10-12 Dec 1997
  • Firstpage
    745
  • Lastpage
    752
  • Abstract
    We propose a new algorithm for the parallel generation of suffix arrays for large text databases on high-bandwidth computer networks. Suffix arrays are full-text indexing structures that support very powerful query languages. Our algorithm is based on a parallel indirect mergesort (it is not a simple mergesort procedure) and is compared with a well-known sequential algorithm that is very efficient on a single machine. Although network-bound, the parallel version is, both theoretically and experimentally, a much better alternative than the sequential version, which is bound by disk I/O.
  • Keywords
    parallel algorithms; distributed parallel index generation; full text indexing; high-bandwidth computer networks; parallel algorithm; parallel indirect mergesort; query languages; sequential algorithm; suffix arrays; very large text databases; Abstracts; Computer networks; Constraint theory; Database languages; Distributed databases; Indexing; Information retrieval; Information systems; Power system modeling; Workstations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Algorithms and Architectures for Parallel Processing, 1997. ICAPP 97., 1997 3rd International Conference on
  • Conference_Location
    Melbourne, Vic.
  • Print_ISBN
    0-7803-4229-1
  • Type

    conf

  • DOI
    10.1109/ICAPP.1997.651539
  • Filename
    651539
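
The abstract above describes suffix arrays built with an indirect mergesort. The following is a minimal single-process sketch of that general idea, not the authors' distributed algorithm: it sorts suffix start positions rather than the text itself (the "indirect" part), merges independently sorted partitions, and answers a substring query by binary search. All function names are hypothetical, and the partitions here stand in for what separate machines might hold.

```python
import heapq


def build_suffix_array(text, positions=None):
    """Sort suffix start positions by the suffix each one points to.

    Only the positions are reordered; the text is never moved.
    This naive version copies suffixes for comparison, so it is
    only suitable for small illustrative inputs.
    """
    if positions is None:
        positions = range(len(text))
    return sorted(positions, key=lambda i: text[i:])


def merge_suffix_arrays(text, local_arrays):
    """Merge already-sorted local arrays into one global suffix array."""
    return list(heapq.merge(*local_arrays, key=lambda i: text[i:]))


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of pattern, using two binary searches."""
    m = len(pattern)

    def prefix(k):
        # First m characters of the suffix stored at rank k.
        start = suffix_array[k]
        return text[start:start + m]

    # Lower bound: first rank whose prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first rank whose prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    text = "banana"
    # Two partitions of the suffix positions, sorted independently.
    parts = [build_suffix_array(text, range(0, 3)),
             build_suffix_array(text, range(3, 6))]
    global_sa = merge_suffix_arrays(text, parts)
    print(global_sa)
    print(find_occurrences(text, global_sa, "ana"))
```

In this sketch each partition could be sorted by a different worker before the merge; how the paper distributes the text and exchanges data over the network is not specified in this record and is not modeled here.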