DocumentCode
2182033
Title
Compressing distributed text in parallel with (s, c)-dense codes
Author
Bonacic, Carolina ; Fariña, Antonio ; Marín, Mauricio ; Brisaboa, Nieves R.
Author_Institution
Departamento de Ciencia de la Computacion, Univ. de Chile, Chile
fYear
2004
fDate
11-12 Nov. 2004
Firstpage
93
Lastpage
98
Abstract
Systems able to cope with very large text collections make intensive use of distributed-memory parallel computing platforms such as clusters of PCs. This is particularly evident in Web search engines, which must resort to parallelism to deal efficiently with both high query rates and high space requirements in the form of large numbers of small documents stored in secondary memory. These documents can be stored in compressed format to reduce memory space and communication time. This paper proposes a parallel algorithm for compressing text in such a distributed-memory environment. We show efficient performance against the usual practice of compressing the whole text on a single machine.
Keywords
data compression; distributed memory systems; parallel algorithms; text analysis; (s, c)-dense codes; Web search engines; compressed format; distributed memory environment; distributed memory parallel computing; distributed text compression; parallel algorithm; Compression algorithms; Concurrent computing; Databases; Distributed computing; Laboratories; Parallel algorithms; Parallel processing; Personal communication networks; Search engines; Web search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science Society, 2004. SCCC 2004. 24th International Conference of the Chilean
Print_ISBN
0-7695-2185-1
Type
conf
DOI
10.1109/QEST.2004.6
Filename
1372109