DocumentCode
3529623
Title
Relation discovery by named entity recognition from Tibetan websites
Author
Yu, Hongzhi ; Jiang, Tao ; Zhang, Bing ; Chen, Xinyi
Author_Institution
State Key Lab. of Nat. Languages Inf. Technol., Northwest Univ. for Nat., Lanzhou, China
fYear
2009
fDate
23-24 Aug. 2009
Firstpage
177
Lastpage
179
Abstract
Discovering the significant relations embedded in Web pages would be very useful for community discovery. In this paper, we propose an unsupervised method for relation discovery from Tibetan Web pages, based on co-occurrences of named entities in the pages. To find the relations, we propose a rule-based named entity recognition algorithm. Our experiment, using 30.2 megabytes of plain text from three large Tibetan Web sites, shows that the algorithm achieves high precision and recall. We also give a relation strength formula combining co-occurrence frequency and personal information, so that relations in the Web pages can easily be found according to the value of the relation strength.
Keywords
data mining; information retrieval; natural language processing; social networking (online); unsupervised learning; Tibetan Web pages; Tibetan Web sites; community network discovery; named entity co-occurrence frequency; personal information; relation discovery; relation strength formula; rule-based named entity recognition algorithm; social network discovery; unsupervised method; Data processing; Encoding; Frequency; Information technology; Internet; Java; Laboratories; Libraries; Markup languages; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Society, 2009. SWS '09. 1st IEEE Symposium on
Conference_Location
Lanzhou
Print_ISBN
978-1-4244-4157-0
Electronic_ISBN
978-1-4244-4158-7
Type
conf
DOI
10.1109/SWS.2009.5271789
Filename
5271789