DocumentCode :
2176881
Title :
Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics
Author :
Audhkhasi, Kartik ; Georgiou, Panayiotis ; Narayanan, Shrikanth S.
Author_Institution :
Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
4980
Lastpage :
4983
Abstract :
Professional manual transcription of speech is an expensive and time-consuming process. This paper focuses on the problem of combining noisy transcriptions from multiple non-expert transcribers, where the quality of work varies from worker to worker. Computing transcriber reliability is a difficult task in the absence of gold-standard reference transcripts. Three simple metrics for quantifying this reliability without using a gold standard are proposed. We create a database of 1000 Mexican Spanish broadcast news audio clips, each transcribed by five transcribers through Amazon Mechanical Turk. Combining multiple noisy transcripts using these reliability scores improves the word error rate of the combined transcript with respect to the LDC gold standard by 8% relative, and the sentence error rate by 4.1% relative, when compared with a combination without any reliability information.
Keywords :
reliability; speech processing; Amazon Mechanical Turk; LDC gold standard; Mexican Spanish broadcast news audio clip transcription; broadcast news speech transcription; gold standard reference transcript; multiple noisy transcriber; multiple nonexpert transcriber; noisy transcription; sentence error rate; speech professional manual transcription; unsupervised reliability metric; Indexes; Speech transcription; crowd sourcing; evaluator reliability
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947474
Filename :
5947474