• DocumentCode
    3515541
  • Title

    A Web Mining Model for Real-time Webpage Personalization

  • Author

    Shen Hui-zhang ; Ji-di, Zhao ; Zhong-zhi, Yang

  • Author_Institution
    Inst. of Syst. Eng., Shanghai Jiao Tong Univ.
  • fYear
    2006
  • fDate
    5-7 Oct. 2006
  • Firstpage
    8
  • Lastpage
    12
  • Abstract
    Determining the size of the World Wide Web is extremely difficult. The Web can be viewed as the largest data source available, and it presents a challenging task for effective design and access. One proposed Web mining approach to this problem is personalization. With personalization, Web access or the contents of a Web page are modified to better fit the desires of the user. This may involve dynamically creating Web pages that are unique to each user, or using the desires of a user to determine which Web documents to retrieve. This paper presents a Web mining model based on dynamic clustering and a hidden Markov model. The output of the model is information for dynamically creating a Web page that best meets the user's desires. The assumption behind the dynamic clustering is that if a group of users share the same interest trend, the pages they have visited are probably related. We propose that humans should be the authority for judging the correlation between two pages. In the first phase, the model compiles statistics on a user's Web browsing records in the log file; finds a group of users who share the same interest trend as that user; collects all the pages in which this group of users is interested; calculates the correlation between pages; and clusters the pages into several categories according to a predetermined threshold. Each Web page category is treated as a stochastic state variable. In the second phase, a hidden Markov model is constructed to mine the latent desires of a user given an observed sequence of Web pages that the user has browsed. To obtain the optimal parameters of the model (the transition probability matrix, the conditional probability and the initial state), we applied the Baum-Welch parameter estimation method, an instance of the EM algorithm, to train the model on the data set. Experimental results show that the model is practicable and efficient.
  • Keywords
    Internet; data mining; expectation-maximisation algorithm; hidden Markov models; pattern clustering; Web document; Web mining model; World Wide Web; dynamic clustering; hidden Markov model; real-time Web page personalization; Clustering algorithms; Engineering management; Humans; Parameter estimation; Statistics; Stochastic processes; Web mining; Web pages; Web sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Management Science and Engineering, 2006. ICMSE '06. 2006 International Conference on
  • Conference_Location
    Lille
  • Print_ISBN
    7-5603-2355-3
  • Type

    conf

  • DOI
    10.1109/ICMSE.2006.313915
  • Filename
    4104858