• DocumentCode
    2615468
  • Title
    Deriving Link Context through Dependency Analysis
  • Author
    Jing, Tao ; Peng, Tao ; Zuo, Wanli
  • Author_Institution
    Coll. of Comput. Sci. & Technol., Ji Lin Univ., Changchun, China
  • fYear
    2009
  • fDate
    17-20 April 2009
  • Firstpage
    186
  • Lastpage
    190
  • Abstract
    Link context is a useful complement to anchor text when predicting the topic of a target Web page. In this paper, we define the link context of an anchor text as the set of words that have a dependency relationship with it, and we propose an effective method for extracting it. First, we decompose the whole sentence into sub-clauses through dependency analysis; each sub-clause represents a semantic group. Second, we find the set of sub-clauses associated with each anchor. Finally, from that set we choose one sub-clause, which contains the anchor text and meets the selection rule, as the link context of the anchor. To the best of our knowledge, this is the first time link context has been derived with a natural language processing (NLP) technique, namely dependency analysis of the sentence. Preliminary results show that the quality of the link context obtained by this method is significantly improved, and that the method can make up for the weakness in text analysis of some heuristic methods based on the HTML structure of the Web page.
  • Keywords
    Internet; hypermedia markup languages; natural language processing; text analysis; HTML structure; World Wide Web; anchor text; link context; sentence dependency analysis; target Web page; Analytical models; Computer science; Computer science education; Educational institutions; Educational technology; HTML; Web mining; Web pages; Web sites; parser
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Education Technology and Computer, 2009. ICETC '09. International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-0-7695-3609-5
  • Type
    conf
  • DOI
    10.1109/ICETC.2009.66
  • Filename
    5169478