• DocumentCode
    2182033
  • Title
    Compressing distributed text in parallel with (s, c)-dense codes
  • Author
    Bonacic, Carolina ; Fariña, Antonio ; Marín, Mauricio ; Brisaboa, Nieves R.
  • Author_Institution
    Departamento de Ciencia de la Computacion, Univ. de Chile, Chile
  • fYear
    2004
  • fDate
    11-12 Nov. 2004
  • Firstpage
    93
  • Lastpage
    98
  • Abstract
    Systems able to cope with very large text collections make intensive use of distributed-memory parallel computing platforms such as clusters of PCs. This is particularly evident in Web search engines, which must resort to parallelism to deal efficiently with both high query rates and high space requirements, the latter arising from large numbers of small documents stored in secondary memory. Those documents can be stored in compressed format to reduce memory space and communication time. This paper proposes a parallel algorithm for compressing text in such a distributed-memory environment. We show that it performs efficiently compared with the usual practice of compressing the whole text on a single machine.
  • Keywords
    data compression; distributed memory systems; parallel algorithms; text analysis; (s, c)-dense codes; Web search engines; compressed format; distributed memory environment; distributed memory parallel computing; distributed text compression; parallel algorithm; Compression algorithms; Concurrent computing; Databases; Distributed computing; Laboratories; Parallel algorithms; Parallel processing; Personal communication networks; Search engines; Web search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science Society, 2004. SCCC 2004. 24th International Conference of the Chilean
  • Print_ISBN
    0-7695-2185-1
  • Type
    conf
  • DOI
    10.1109/QEST.2004.6
  • Filename
    1372109