• DocumentCode
    1004822
  • Title

    Distributed Indexing of Large-Scale Web Collections

  • Author

    Costa, Miguel ; Silva, Mário J.

  • Volume
    3
  • Issue
    1
  • fYear
    2005
  • fDate
    3/1/2005 12:00:00 AM
  • Firstpage
    2
  • Lastpage
    8
  • Abstract
    Sidra is a new indexing and ranking system for large-scale Web collections. Sidra creates multiple distributed indexes, organized and partitioned by different ranking criteria, aimed at supporting contextualized queries over hypertexts and their metadata. This paper presents the architecture of Sidra and the algorithms used to create its indexes. Performance measurements on the Portuguese Web data show that Sidra's indexing times and scalability are comparable to those of global Web search engines.
  • Keywords
    Indexing; Web; search engines; Electronic switching systems; Large-scale systems; Personal communication networks; Single event transient
  • fLanguage
    English
  • Journal_Title
    Latin America Transactions, IEEE (Revista IEEE America Latina)
  • Publisher
    IEEE
  • ISSN
    1548-0992
  • Type

    jour

  • DOI
    10.1109/TLA.2005.1468656
  • Filename
    1468656