Title :
Populating ConceptNet Knowledge Base with Information Acquired from Japanese Wikipedia
Author :
Marek Krawczyk;Rafal Rzepka;Kenji Araki
Author_Institution :
Hokkaido Univ., Sapporo, Japan
Abstract :
This paper presents a method for automatically acquiring IsA assertions (hyponymy relations), AtLocation assertions (which give the location of objects) and LocatedNear assertions (which give neighboring locations) from Japanese Wikipedia XML dump files. To extract IsA assertions, we use the Hyponymy extraction tool v1.0, which analyses the definition, category and hierarchy structures of Wikipedia articles. The tool also produces an information-rich taxonomy from which, using our original method, we can extract additional information, in this case AtLocation and LocatedNear assertions. Experiments showed that both methods produce positive results: we were able to acquire 5,866,680 IsA assertions with 99.0% reliability, 131,760 AtLocation assertion pairs with 93.0% reliability and 6,217 LocatedNear assertion pairs with 99.0% reliability. Our method exceeded the baseline system in both precision and the number of acquired assertions.
Keywords :
"Internet","Encyclopedias","Electronic publishing","Knowledge based systems","Reliability","Cities and towns"
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on
DOI :
10.1109/SMC.2015.519