DocumentCode
561157
Title
Improving Classifier Performance by Autonomously Collecting Background Knowledge from the Web
Author
Minton, Steven N. ; Michelson, Matthew ; See, Kane ; Macskassy, Sofus ; Gazen, Bora C. ; Getoor, Lise
Author_Institution
InferLink Corp, El Segundo, CA, USA
Volume
1
fYear
2011
fDate
18-21 Dec. 2011
Firstpage
1
Lastpage
6
Abstract
Many websites allow users to tag data items to make them easier to find. In this paper we consider the problem of classifying tagged data according to user-specified interests. We present an approach for aggregating background knowledge from the Web to improve the performance of a classifier. In previous work, researchers have developed technology for extracting knowledge, in the form of relational tables, from semi-structured websites. In this paper we integrate this extraction technology with generic machine learning algorithms, showing that knowledge extracted from the Web can significantly benefit the learning process. Specifically, the knowledge can lead to better generalizations, reduce the number of samples required for supervised learning, and eliminate the need to retrain the system when the environment changes. We validate the approach with an application that classifies tagged Flickr data.
Keywords
Web sites; information retrieval; learning (artificial intelligence); pattern classification; Website; autonomous background knowledge collection; background knowledge aggregation; classifier performance improvement; generic machine learning algorithm; knowledge extraction technology; supervised learning; tagged Flickr data; tagged data classification; user-specified interest; Cities and towns; Data mining; Fires; Knowledge engineering; Monitoring; Portals; Training; Background Knowledge; Classifiers; Information Extraction; Ontologies; Web Harvesting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
978-1-4577-2134-2
Type
conf
DOI
10.1109/ICMLA.2011.76
Filename
6146932