• DocumentCode
    1607445
  • Title
    Towards optimal keyword-based content dissemination in DHT-based P2P networks
  • Author
    Rao, Weixiong ; Vitenberg, Roman ; Tarkoma, Sasu
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Helsinki, Helsinki, Finland
  • fYear
    2011
  • Firstpage
    102
  • Lastpage
    111
  • Abstract
    Keyword-based content alert services, e.g., Google Alerts and Microsoft Live Alerts, allow end users to automatically receive useful and recent content. In this paper, we leverage the favorable properties of DHTs, such as scalability, and propose a design for a scalable keyword-based content alert service. The DHT-based architecture matches textual documents with queries based on document terms: for each term, the implementation assigns a home node that is responsible for handling the documents and queries that contain the term. The main challenge of this keyword-based matching scheme is the large number of terms that appear in a typical document, which results in a high publication cost. Fortunately, a document can be forwarded to the home nodes of a carefully selected subset of its terms without incurring false negatives. In this paper, we focus on the MTAF problem of minimizing the number of selected terms used to forward the published content. We show that the problem is NP-hard, and consider centralized and DHT-based solutions. Experimental results based on real datasets indicate that the proposed solutions are efficient compared to existing approaches. In particular, the similarity-based replication of filters, a key element of our solution, is shown to mitigate the effect of hotspots, which arise because some terms are substantially more popular than others, both in documents and in queries.
  • Keywords
    file organisation; peer-to-peer computing; DHT-based architecture; MTAF problem; NP-hardness; P2P network; distributed hash table; keyword-based matching scheme; optimal keyword-based content dissemination; scalable keyword-based content alert service; similarity-based replication; Data models; IEEE Communications Society; Maintenance engineering; Peer to peer computing; Publishing; Registers; Vegetation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Peer-to-Peer Computing (P2P), 2011 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    2161-3559
  • Print_ISBN
    978-1-4577-0150-4
  • Electronic_ISBN
    2161-3559
  • Type
    conf
  • DOI
    10.1109/P2P.2011.6038667
  • Filename
    6038667
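
The term-based routing described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a consistent-hashing ring for assigning home nodes, conjunctive matching (a query matches when all of its terms occur in the document), and registration of each query at the home node of a single hypothetical "anchor" term. The names `Ring`, `AlertService`, `register`, and `publish` are invented for illustration, and the sketch does not attempt to solve the MTAF selection problem itself: the caller supplies the selected terms.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str, bits: int = 32) -> int:
    """Hash a string onto an identifier ring of size 2**bits."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Toy DHT: a term's home node is its successor on the ring."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def home_node(self, term: str) -> str:
        # First node whose id is >= the term's id, wrapping around.
        idx = bisect_left(self.ids, ring_id(term)) % len(self.nodes)
        return self.nodes[idx][1]


class AlertService:
    """Hypothetical alert service built on the toy ring."""

    def __init__(self, ring: Ring):
        self.ring = ring
        # node name -> {anchor term -> [queries registered under it]}
        self.filters_at = {}

    def register(self, query_terms, anchor: str) -> None:
        # Assumption: a query is stored only at its anchor's home node.
        if anchor not in query_terms:
            raise ValueError("anchor must be one of the query terms")
        node = self.ring.home_node(anchor)
        by_term = self.filters_at.setdefault(node, {})
        by_term.setdefault(anchor, []).append(frozenset(query_terms))

    def publish(self, doc_terms, selected_terms):
        """Forward a document to the home nodes of selected_terms only.

        Returns the set of matched queries and the set of contacted nodes.
        """
        doc = set(doc_terms)
        matched, contacted = set(), set()
        for term in selected_terms:
            node = self.ring.home_node(term)
            contacted.add(node)
            for query in self.filters_at.get(node, {}).get(term, []):
                if query <= doc:  # conjunctive match (assumption)
                    matched.add(query)
        return matched, contacted
```

Under these assumptions, forwarding a document only to the home nodes of terms that actually serve as anchors finds the same matches as forwarding it to the home node of every document term, while dropping a needed anchor term produces a false negative. Choosing a small sufficient subset is the part the abstract identifies as NP-hard.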