DocumentCode :
2348639
Title :
Parallel generation of inverted files for distributed text collections
Author :
Ribeiro-Neto, Berthier A. ; Kitajima, João Paulo ; Navarro, Gonzalo ; Sant'Ana, Cláudio R. G. ; Ziviani, Nivio
Author_Institution :
Comput. Sci. Dept., Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
fYear :
1998
fDate :
9-14 Nov 1998
Firstpage :
149
Lastpage :
157
Abstract :
The authors present a scalable algorithm for the parallel computation of inverted files for large text collections. The algorithm targets a high-bandwidth network of workstations with a shared-nothing memory organization. The text collection is assumed to be evenly distributed among the disks of the workstations. Compression is used to save space in main memory (where inverted lists are kept) and to save time when data must be moved across the network. The algorithm's average running cost is O(t/p), where t is the size of the whole text collection and p is the number of available processors. The authors implemented the algorithm and report experimental results: on a 100 Mbit/s switched Ethernet network of four 200 MHz Pentium Pro machines, each with 128 MB of RAM, they inverted 2 GB of TREC documents in 15 minutes. They also propose an analytical model for the algorithm's execution time.
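The abstract does not spell out the algorithm's steps, so the following is only a minimal illustrative sketch of the general idea: invert each partition locally, then merge the partial indexes. It is not the authors' method. All function names and the toy data are hypothetical, compression and network transfer are omitted, and the partitions are processed sequentially here rather than on separate workstations.

```python
from collections import defaultdict

def invert_local(docs):
    """Build an inverted index (term -> sorted doc ids) for one partition."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def merge_indexes(local_indexes):
    """Merge per-partition indexes into a single global index."""
    merged = defaultdict(list)
    for local in local_indexes:
        for term, postings in local.items():
            merged[term].extend(postings)
    return {term: sorted(postings) for term, postings in merged.items()}

# Hypothetical collection split evenly across p = 2 partitions.
partitions = [
    {1: "parallel inverted files", 2: "text collections"},
    {3: "inverted files for text", 4: "parallel algorithms"},
]
global_index = merge_indexes(invert_local(part) for part in partitions)
```

Because each partition is inverted independently, the local phase is the part that would divide across p processors, which is in the spirit of the O(t/p) average cost stated in the abstract.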
Keywords :
computational complexity; data compression; full-text databases; local area networks; parallel algorithms; PentiumPro; TREC documents; algorithm average running cost; algorithm execution time; analytical model; compression; distributed text collections; high bandwidth workstation network; large text collections; main memory; parallel inverted file generation; scalable algorithm; shared-nothing memory organization; switched Ethernet network; Computer science; Concurrent computing; Costs; Distributed computing; Hardware; Indexing; Information retrieval; Read-write memory; Switches; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science, 1998. SCCC '98. XVIII International Conference of the Chilean Society of
Conference_Location :
Antofagasta
Print_ISBN :
0-8186-8616-2
Type :
conf
DOI :
10.1109/SCCC.1998.730794
Filename :
730794