DocumentCode
1004822
Title
Distributed Indexing of Large-Scale Web Collections
Author
Costa, Miguel ; Silva, Mário J.
Volume
3
Issue
1
fYear
2005
fDate
3/1/2005 12:00:00 AM
Firstpage
2
Lastpage
8
Abstract
Sidra is a new indexing and ranking system for large-scale Web collections. Sidra creates multiple distributed indexes, organized and partitioned by different ranking criteria, aimed at supporting contextualized queries over hypertexts and their metadata. This paper presents the architecture of Sidra and the algorithms used to create its indexes. Performance measurements on the Portuguese Web data show that Sidra's indexing times and scalability are comparable to those of global Web search engines.
Keywords
Indexing; Web; search engines; Electronic switching systems; Large-scale systems; Personal communication networks; Single event transient
fLanguage
English
Journal_Title
Latin America Transactions, IEEE (Revista IEEE America Latina)
Publisher
IEEE
ISSN
1548-0992
Type
jour
DOI
10.1109/TLA.2005.1468656
Filename
1468656