• DocumentCode
    3529623
  • Title
    Relation discovery by named entity recognition from Tibetan websites
  • Author
    Yu, Hongzhi ; Jiang, Tao ; Zhang, Bing ; Chen, Xinyi
  • Author_Institution
    State Key Lab. of Nat. Languages Inf. Technol., Northwest Univ. for Nat., Lanzhou, China
  • fYear
    2009
  • fDate
    23-24 Aug. 2009
  • Firstpage
    177
  • Lastpage
    179
  • Abstract
    Discovering the significant relations embedded in Web pages would be very useful for community discovery. In this paper, we propose an unsupervised method for relation discovery from Tibetan Web pages, based on co-occurrences of named entities in the pages. To find the relations, we propose a rule-based named entity recognition algorithm. Our experiment, using 30.2 megabytes of plain text from three large Tibetan Web sites, shows that the algorithm achieves high precision and recall. We also give a relation strength formula combining co-occurrence frequency and personal information, so that relations in the Web pages can easily be found according to the value of the relation strength.
  • Keywords
    data mining; information retrieval; natural language processing; social networking (online); unsupervised learning; Tibetan Web pages; Tibetan Web sites; community network discovery; named entity co-occurrence frequency; personal information; relation discovery; relation strength formula; rule-based named entity recognition algorithm; social network discovery; unsupervised method; Data processing; Encoding; Frequency; Information technology; Internet; Java; Laboratories; Libraries; Markup languages; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Society, 2009. SWS '09. 1st IEEE Symposium on
  • Conference_Location
    Lanzhou
  • Print_ISBN
    978-1-4244-4157-0
  • Electronic_ISBN
    978-1-4244-4158-7
  • Type
    conf
  • DOI
    10.1109/SWS.2009.5271789
  • Filename
    5271789