• DocumentCode
    3644581
  • Title
    Improving web clustering through a new modeling for web documents
  • Author
    Ioan Agavriloaei;Adrian Alexandrescu;Mitică Craus
  • Author_Institution
    Faculty of Automatic Control and Computer Engineering, “
  • fYear
    2011
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The constant, rapid growth in the size and complexity of the Web creates new challenges for processing Web search results efficiently. Because Web content is dynamic and search engines return huge amounts of information, new methods are needed to better organize and model the information spread across the Web. In this paper, we propose structuring Web content as a hierarchical environment that takes into account the site content and structure, the HTML document structure, and term importance. Furthermore, we propose an effective partitional clustering algorithm for a Web site. Preliminary results demonstrate the effectiveness of the new Web content representation and the accuracy of the Web clustering algorithm.
  • Keywords
    "Clustering algorithms","HTML","Vectors","Accuracy","Algorithm design and analysis","Internet","Partitioning algorithms"
  • Publisher
    IEEE
  • Conference_Titel
    System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
  • Print_ISBN
    978-1-4577-1173-2
  • Type
    conf
  • Filename
    6085702