DocumentCode :
3703521
Title :
From one star to three stars: Upgrading legacy open data using crowdsourcing
Author :
Satoshi Oyama;Yukino Baba;Ikki Ohmukai;Hiroaki Dokoshi;Hisashi Kashima
Author_Institution :
Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido 060-0814, Japan
fYear :
2015
Firstpage :
1
Lastpage :
9
Abstract :
Despite recent open data initiatives in many countries, a significant percentage of the data provided is in non-machine-readable formats like image format rather than in a machine-readable electronic format, thereby restricting their usability. This paper describes the first unified framework for converting legacy open data in image format into a machine-readable and reusable format by using crowdsourcing. Crowd workers are asked not only to extract data from an image of a chart but also to reproduce the chart objects in spreadsheets. The properties of the reconstructed chart objects give their data structures including series names and values, which are useful for automatic processing of data by computer. Since results produced by crowdsourcing inherently contain errors, a quality control mechanism was developed that improves the accuracy of extracted tables by aggregating tables created by different workers for the same chart image and by utilizing the data structures obtained from the reproduced chart objects. Experimental results demonstrated that the proposed framework and mechanism are effective.
Keywords :
"Crowdsourcing","Data mining","Resource description framework","Software","Computers","Data structures","Licenses"
Publisher :
IEEE
Conference_Title :
2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Print_ISBN :
978-1-4673-8272-4
Type :
conf
DOI :
10.1109/DSAA.2015.7344801
Filename :
7344801