DocumentCode :
1796903
Title :
Assessing the quality of digital re-publishing of textual documents through the follow-up of a correction protocol by crowdsourcing
Author :
Lagarrigue, Marthe ; Rossant, F. ; Pierrot, Alain ; Gardes, Joel ; Maldivi, Christophe ; Petit, Eric
Author_Institution :
ISEP, Inst. Super. d'Electron. de Paris, Paris, France
fYear :
2014
fDate :
1-2 Nov. 2014
Firstpage :
1
Lastpage :
5
Abstract :
Digital re-publishing of documents has become a very important issue. Optical Character Recognition (OCR) has been used intensively for this purpose, as it transcribes text images into electronic files, enabling display functionalities, indexing, enrichment and broadcasting. However, such software still fails in many configurations, so the transcription does not reach the required editorial quality (a 99% recognition rate is required for ergonomic reading). In the OZALID project, we propose to rely on crowdsourcing to correct OCR results. One main issue is then to determine when the crowdsourcing has reached its limits. To that end, we present a feasibility study of an original protocol based on indicators that quantify the recognition quality in both semantic and semiotic terms. These indicators are computed and tracked throughout the crowdsourcing process until they stabilize. Experimental results show that the proposed observables converge after a number of correction iterations, which makes it possible to stop the crowdsourcing process automatically and to handle huge amounts of data.
Keywords :
document handling; electronic publishing; quality management; OCR; correction protocol; crowdsourcing process; digital re-publishing; optical character recognition; quality assessment; recognition quality; textual documents; Crowdsourcing; Optical character recognition software; Protocols; Semantics; Shape; digital edition; semiotics
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence for Multimedia Understanding (IWCIM), 2014 International Workshop on
Conference_Location :
Paris
Type :
conf
DOI :
10.1109/IWCIM.2014.7008814
Filename :
7008814