Title :
Multi-Type Web Relation Extraction Based on Bootstrapping
Author :
Liu, Xiaojiang ; Yu, Nenghai
Author_Institution :
MOE-MS KeyLab of MCC, Univ. of Sci. & Technol. of China, Hefei, China
Abstract :
Web-scale relation extraction is crucial to building Web people search engines. Previous extraction models, such as Snowball, focus on extracting only a single relation type, while real applications require as many relation types as possible. In this paper, we propose a novel Web-scale relation extraction framework, Multi-Type Snowball (MultiSnowball). MultiSnowball aims to extract multiple relation types simultaneously while starting with only one pattern. By adopting the general bootstrapping framework, MultiSnowball not only iteratively finds new relation tuples and extraction patterns, but also iteratively identifies new relation types. Patterns are shared among all types during the simultaneous extraction process to extract more relation tuples. Empirical studies on a real Web-scale data set show the effectiveness of MultiSnowball over the baseline and Snowball, as well as its ability to identify accurate relation types.
Keywords :
Internet; computer bootstrapping; information retrieval; search engines; Web people search engines; Web-scale data set; general bootstrapping framework; multitype Snowball extraction models; multitype Web relation extraction; simultaneous extraction process; single type extraction; Data mining; Electronic publishing; Encyclopedias; Organizations; Pattern matching; Search engines; Web pages; bootstrapping; people search; relation extraction;
Conference_Title :
Information Engineering (ICIE), 2010 WASE International Conference on
Conference_Location :
Beidaihe, Hebei
Print_ISBN :
978-1-4244-7506-3
Electronic_ISBN :
978-1-4244-7507-0
DOI :
10.1109/ICIE.2010.365