• DocumentCode
    3231268
  • Title

    Page clustering using a distance based algorithm

  • Author

    Mojica, Jairo Andrés ; Rojas, Diego Alexander ; Gómez, Jonatan ; González, Fabio

  • Author_Institution
    Intelligent Syst. Res. Lab., Nat. Univ. of Colombia, Colombia
  • fYear
    2005
  • fDate
    31 Oct.-2 Nov. 2005
  • Abstract
    This paper presents an application of a clustering algorithm based on gravitational forces to the problem of Web page clustering in a dynamic environment. The proposed algorithm is a modification of the gravitational algorithm proposed by Gómez et al. that uses only distance measures (a notion of space is not required). This approach is useful when similarities (and hence distances) between pages can be defined and computed quickly, but the definition of a space is computationally expensive. Experiments are performed with data representing real URLs and sessions, and the algorithm is compared with the incremental connected components algorithm, which has previously been used to solve this problem.
  • Keywords
    Internet; Web sites; data mining; pattern clustering; URL; Web page clustering; data representation; distance based algorithm; dynamic environment; gravitational algorithm; Clustering algorithms; Content based retrieval; Data mining; Extraterrestrial measurements; Information management; Information retrieval; Intelligent systems; Statistical analysis; Web mining; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Congress, 2005. LA-WEB 2005. Third Latin American
  • Print_ISBN
    0-7695-2471-0
  • Type

    conf

  • DOI
    10.1109/LAWEB.2005.27
  • Filename
    1592381
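
The abstract names incremental connected components as the baseline that the proposed algorithm is compared against. The record does not describe either algorithm in detail, so the following is only a minimal sketch of one common way to build such a baseline with a union-find structure. It is not the authors' implementation and it is not the gravitational algorithm. The class name `IncrementalComponents`, the Jaccard distance over session sets, the sample pages, and the threshold of 0.5 are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class IncrementalComponents:
    """Union-find over pages: pages within `threshold` of each other share a cluster.

    Hypothetical baseline sketch; only a pairwise distance function is required,
    no coordinate space.
    """

    def __init__(self, distance: Callable[[str, str], float], threshold: float):
        self.distance = distance
        self.threshold = threshold
        self.parent: Dict[str, str] = {}

    def find(self, page: str) -> str:
        # Follow parent links to the root, halving the path on the way.
        while self.parent[page] != page:
            self.parent[page] = self.parent[self.parent[page]]
            page = self.parent[page]
        return page

    def add(self, page: str) -> None:
        # Insert one new page and merge it with every close-enough existing page.
        if page in self.parent:
            return
        existing = list(self.parent)
        self.parent[page] = page
        for other in existing:
            if self.distance(page, other) <= self.threshold:
                self.parent[self.find(other)] = self.find(page)

    def clusters(self) -> List[List[str]]:
        groups: Dict[str, List[str]] = {}
        for page in self.parent:
            groups.setdefault(self.find(page), []).append(page)
        return sorted(sorted(group) for group in groups.values())


# Assumed toy data: the set of session ids in which each page was visited.
SESSIONS = {"/a": {1, 2}, "/b": {1, 2, 3}, "/c": {7}, "/d": {7, 8}}


def jaccard_distance(p: str, q: str) -> float:
    # Assumed distance: 1 minus the Jaccard similarity of the session sets.
    shared = len(SESSIONS[p] & SESSIONS[q])
    total = len(SESSIONS[p] | SESSIONS[q])
    return 1.0 - shared / total


model = IncrementalComponents(jaccard_distance, threshold=0.5)
for url in ["/a", "/b", "/c", "/d"]:
    model.add(url)  # pages arrive one at a time, as in a dynamic environment
result = model.clusters()
```

Each insertion compares the new page against all pages seen so far, so the sketch relies only on pairwise distances, which matches the setting the abstract describes. The threshold controls how aggressively components merge and would need tuning on real session data.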