DocumentCode :
659444
Title :
Robust crowdsourced learning
Author :
Zhiquan Liu ; Luo Luo ; Wu-Jun Li
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
338
Lastpage :
343
Abstract :
In general, supervised learning algorithms need a large number of labels to achieve satisfactory performance. Collecting such labeled data is typically time-consuming and expensive. Recently, crowdsourcing services have provided an effective way to collect labeled data at much lower cost. Hence, crowdsourced learning (CL), which performs learning with labeled data collected from crowdsourcing services, has become a very active research topic in recent years. Most existing CL methods exploit only the labels from different workers (annotators) for learning while ignoring the attributes of the instances. In many real applications, the attributes of the instances are actually the most discriminative information for learning. Hence, CL methods with attributes have attracted more and more attention from CL researchers. One representative model of this kind is the personal classifier (PC) model, which has achieved state-of-the-art performance. However, the PC model makes the unreasonable assumption that all workers contribute equally to the final classification. This contradicts the fact that different workers have different quality (ability) in data labeling. In this paper, we propose a novel model, called robust personal classifier (RPC), for robust crowdsourced learning. Our model can automatically learn an expertise score for each worker. This expertise score reflects the inherent quality of each worker. The final classifier of our RPC model gives high weights to good workers and low weights to poor workers or spammers, which is more reasonable than the PC model with equal weights for all workers. Furthermore, the learned expertise score can be used to eliminate spammers or low-quality workers. Experiments on simulated datasets and UCI datasets show that the proposed model can dramatically outperform baseline models such as the PC model in terms of classification accuracy and ability to detect spammers.
Keywords :
learning (artificial intelligence); pattern classification; ubiquitous computing; CL; PC; RPC; UCI datasets; classification accuracy; crowdsourcing services; data labeling; expertise score; final classification; labeled data; learned expertise score; low-quality workers; personal classifier model; robust crowdsourced learning; robust personal classifier; simulated datasets; spammers; supervised learning algorithms; Accuracy; Labeling; Logistics; Noise measurement; Optimization; Robustness; Training; crowdsourced learning; crowdsourcing; supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691593
Filename :
6691593