Title of article :
Lexical and Semantic Clustering by Web Links
Author/Authors :
Filippo Menczer, Author
Issue Information :
Monthly, serial issue, 2004
Abstract :
Recent Web-searching and -mining tools are combining
text and link analysis to improve ranking and crawling
algorithms. The central assumption behind such approaches
is that there is a correlation between the graph
structure of the Web and the text and meaning of pages.
Here I formalize and empirically evaluate two general
conjectures drawing connections from link information
to lexical and semantic Web content. The link-content
conjecture states that a page is similar to the pages that
link to it, and the link-cluster conjecture that pages
about the same topic are clustered together. These conjectures
are often simply assumed to hold, and Web
search tools are built on such assumptions. The present
quantitative confirmation sheds light on the connection
between the success of the latest Web-mining
techniques and the small world topology of the Web,
with encouraging implications for the design of better
crawling algorithms.
Journal title :
Journal of the American Society for Information Science and Technology