• DocumentCode
    3099493
  • Title

    Similarity Computation of Web Pages

  • Author

    Shi, Peng ; Ding, Lianhong ; Liu, Bingwu

  • Author_Institution
    Sci. Center for Mater. Service Safety, Univ. of Sci. & Technol. Beijing, Beijing
  • fYear
    2008
  • fDate
    21-22 Dec. 2008
  • Firstpage
    777
  • Lastpage
    780
  • Abstract
    Web pages are the main content on the World Wide Web. Web page similarity is very helpful for Web content analysis. Text similarity, usually called similarity computation, has been investigated for decades in the field of artificial intelligence. Some similarity computation methods have been used to compare Web pages. However, text-based similarity computation methods are inadequate for comparing Web pages, because a Web page consists of not only text but also multimedia content, such as images, audio, and video. This paper proposes a new approach to evaluating the similarity of Web pages that considers all of their content. It can make Web page similarity computation more accurate and benefit Web analysis.
  • Keywords
    Internet; Web sites; multimedia computing; Web analysis; Web content analysis; Web pages; World Wide Web; multimedia contents; similarity computation; text similarity; Aggregates; Artificial intelligence; Frequency; Indexing; Large scale integration; Marine safety; Materials science and technology; Nonhomogeneous media; Web page
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Knowledge Acquisition and Modeling Workshop, 2008. KAM Workshop 2008. IEEE International Symposium on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-3530-2
  • Electronic_ISBN
    978-1-4244-3531-9
  • Type

    conf

  • DOI
    10.1109/KAMW.2008.4810606
  • Filename
    4810606