Title :
Generating information-rich taxonomy from Wikipedia
Author :
Yamada, Ichiro ; Hashimoto, Chikara ; Oh, Jong-Hoon ; Torisawa, Kentaro ; Kuroda, Kow ; De Saeger, Stijn ; Tsuchida, Masaaki ; Kazama, Junichi
Author_Institution :
MASTAR Project, Nat. Inst. of Inf. & Commun. Technol., Keihanna, Japan
Abstract :
Although hyponymy relation acquisition has been studied extensively, how informative the acquired hyponymy relations are has not been sufficiently discussed. We found that the hypernyms in automatically acquired hyponymy relations are often too vague or ambiguous to specify the meaning of their hyponyms. For instance, the hypernym “work” is vague and ambiguous in the hyponymy relations work/Avatar and work/The Catcher in the Rye. In this paper, we propose a simple method for generating intermediate concepts of hyponymy relations that make such vague hypernyms more specific. From the less informative relation work/Avatar, our method generates an information-rich hyponymy relation such as work / work by film director / work by James Cameron / Avatar. Furthermore, the generated relation work by film director/Avatar can be paraphrased into a new relation, movie/Avatar. Experiments showed that our method acquired 2,719,441 enriched hyponymy relations with one intermediate concept at a precision of 0.853, and another 6,347,472 hyponymy relations at a precision of 0.786.
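The chain in the example above can be illustrated with a minimal Python sketch. It only reproduces the shape of the enriched relation from the abstract; it is not the authors' implementation. The function names `enrich` and `paraphrase`, the attribute/value pair passed in by hand, and the one-entry paraphrase table are all hypothetical, since the abstract does not state how attributes or paraphrases are obtained.

```python
# Illustrative sketch only: names and data below are assumptions,
# not taken from the paper.

def enrich(hypernym, hyponym, attribute, value):
    """Insert two intermediate concepts between a vague hypernym and its hyponym.

    The "<hypernym> by <attribute>" and "<hypernym> by <value>" patterns
    follow the example given in the abstract.
    """
    return [
        hypernym,
        f"{hypernym} by {attribute}",
        f"{hypernym} by {value}",
        hyponym,
    ]

# Hypothetical paraphrase table; the abstract gives only this one example.
PARAPHRASES = {"work by film director": "movie"}

def paraphrase(concept):
    """Return a more specific hypernym if one is known, else the concept itself."""
    return PARAPHRASES.get(concept, concept)

chain = enrich("work", "Avatar", "film director", "James Cameron")
print(" / ".join(chain))
# work / work by film director / work by James Cameron / Avatar
print(f"{paraphrase(chain[1])}/{chain[-1]}")
# movie/Avatar
```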
Keywords :
Internet; linguistics; James Cameron; Wikipedia; avatar; hypernym work; hyponymy relation acquisition; information rich taxonomy generation; informative relation work; Avatars; Cities and towns; Educational institutions; Electronic publishing; Encyclopedias
Conference_Title :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
DOI :
10.1109/IUCS.2010.5666764