DocumentCode :
2652381
Title :
Bootstrapping Multilingual Relation Discovery Using English Wikipedia and Wikimedia-Induced Entity Extraction
Author :
Schone, Patrick ; Allison, Tim ; Giannella, Chris ; Pfeifer, Craig
Author_Institution :
FamilySearch, Salt Lake City, UT, USA
fYear :
2011
fDate :
7-9 Nov. 2011
Firstpage :
944
Lastpage :
951
Abstract :
Relation extraction has been a subject of significant study over the past decade. Most relation extractors have been developed by combining the training of complex computational systems on large volumes of annotations with extensive rule writing by language experts. Moreover, many relation extractors are reliant on other non-trivial NLP technologies which themselves are developed through significant human efforts, such as entity tagging, parsing, etc. Due to the high cost of creating and assembling the required resources, relation extractors have typically been developed for only high-resourced languages. In this paper, we describe a near-zero-cost methodology to build relation extractors for significantly distinct non-English languages using only freely available Wikipedia and other web documents, and some knowledge of English. We apply our methodology and build alma-mater, birthplace, father, occupation, and spouse relation extractors in Greek, Spanish, Russian, and Chinese. We conduct evaluations of induced relations at the file level which are the most refined we have seen in the literature.
Keywords :
Web sites; computational complexity; document handling; natural language processing; statistical analysis; Chinese; English Wikipedia; Greek; Russian; Spanish; Web documents; Wikimedia induced entity extraction; complex computational systems; entity tagging; language experts; multilingual relation discovery bootstrapping; near-zero-cost methodology; nonEnglish languages; nontrivial NLP technologies; parsing; relation extraction; Artificial intelligence; Conferences; Wikipedia; multilingual relation extraction;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2011 23rd IEEE International Conference on
Conference_Location :
Boca Raton, FL
ISSN :
1082-3409
Print_ISBN :
978-1-4577-2068-0
Electronic_ISBN :
1082-3409
Type :
conf
DOI :
10.1109/ICTAI.2011.163
Filename :
6103454