  • DocumentCode
    1868103
  • Title
    Users, Queries and Documents: A Unified Representation for Web Mining
  • Author
    Diligenti, Michelangelo; Gori, Marco; Maggini, Marco
  • Volume
    1
  • fYear
    2009
  • fDate
    15-18 Sept. 2009
  • Firstpage
    238
  • Lastpage
    244
  • Abstract
    The collective feedback of the users of an Information Retrieval system has proved useful in many tasks. A popular approach in the literature is to process the logs stored by Internet Service Providers (ISPs), Intranet proxies or Web search engines to extract a query-document bipartite graph. In this paper, we propose a richer data structure that preserves most of the information available in the logs, including query refinements, page visits and search activity. In particular, we represent query refinements as separate transitions between the corresponding query nodes in the graph, and we augment the graph by associating one node with each user. Users are linked to the queries they have issued and to the documents they have visited. The resulting data structure is a complete representation of the collective search activity performed by the users of a search engine or of an Intranet. The experimental results show that this more powerful representation can be successfully used to improve the quality of query clustering and to discover query suggestions.
  • Keywords
    Clustering algorithms; Conferences; Data structures; Feedback; Information retrieval; Intelligent agent; Search engines; Web and internet services; Web mining; Web search; Query Clustering; Query Recommendations; Query Suggestions; User Feedback; Web Logs
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Milan, Italy
  • Print_ISBN
    978-0-7695-3801-3
  • Electronic_ISBN
    978-1-4244-5331-3
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2009.41
  • Filename
    5286068
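
The graph described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the log format (sessions of ordered "query" and "click" events), the function name build_graph, and the edge labels are all hypothetical, chosen only to show user, query and document nodes together with directed query-refinement transitions.

```python
from collections import defaultdict


def build_graph(sessions):
    """Build a weighted edge map from search sessions.

    sessions: iterable of (user_id, events), where events is an ordered
    list of ("query", text) or ("click", url) tuples. This log format is
    an assumption made for illustration only.
    Returns a dict mapping (source_node, target_node, edge_kind) -> count.
    """
    edges = defaultdict(int)
    for user_id, events in sessions:
        user = ("user", user_id)
        last_query = None
        for kind, value in events:
            if kind == "query":
                query = ("query", value)
                # Link the user to every query they issued.
                edges[(user, query, "issued")] += 1
                # A different query following another one in the same
                # session is treated as a refinement transition.
                if last_query is not None and last_query != query:
                    edges[(last_query, query, "refined_to")] += 1
                last_query = query
            elif kind == "click":
                doc = ("doc", value)
                # Link the user to every document they visited.
                edges[(user, doc, "visited")] += 1
                # Keep the classic query-document edge as well.
                if last_query is not None:
                    edges[(last_query, doc, "clicked")] += 1
    return dict(edges)


# Toy sessions, invented for the example.
sessions = [
    ("u1", [("query", "jaguar"), ("query", "jaguar car"), ("click", "jaguar.com")]),
    ("u2", [("query", "jaguar car"), ("click", "jaguar.com")]),
]
graph = build_graph(sessions)
```

Dropping the "issued", "visited" and "refined_to" edges from this map leaves only the query-document bipartite graph that the abstract names as the common baseline.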