DocumentCode
3644581
Title
Improving web clustering through a new modeling for web documents
Author
Ioan Agavriloaei;Adrian Alexandrescu;Mitică Craus
Author_Institution
Faculty of Automatic Control and Computer Engineering, “
fYear
2011
Firstpage
1
Lastpage
6
Abstract
The constant, rapid growth in the size and complexity of the Web creates new challenges for the efficient processing of Web search results. Because Web content is dynamic and search engines return huge amounts of information, new methods are needed to better organize and model the information spread across the Web. In this paper, we propose structuring Web content as a hierarchical environment that takes into account the site content and structure, the HTML document structure, and term importance. Furthermore, we propose an effective partitional clustering algorithm for a Web site. Preliminary results demonstrate the effectiveness of the new Web content representation and the accuracy of the Web clustering algorithm.
Keywords
"Clustering algorithms","HTML","Vectors","Accuracy","Algorithm design and analysis","Internet","Partitioning algorithms"
Publisher
ieee
Conference_Titel
System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
Print_ISBN
978-1-4577-1173-2
Type
conf
Filename
6085702