DocumentCode
394232
Title
Towards automatic transcription of large spoken archives - English ASR for the MALACH project
Author
Ramabhadran, Bhuvana ; Huang, Jing ; Picheny, Michael
Author_Institution
Dept. of Human Language Technol., IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Volume
1
fYear
2003
fDate
6-10 April 2003
Abstract
Digital archives have emerged as the pre-eminent method for capturing the human experience. Before such archives can be used efficiently, their contents must be described. The NSF-funded MALACH project aims to provide improved access to large spoken archives by advancing the state of the art in automated speech recognition (ASR), information retrieval (IR), and related technologies [1,2] for multiple languages. This paper describes the ASR research for the English speech in the MALACH corpus. The corpus consists of unconstrained, natural speech filled with disfluencies, heavy accents, age-related coarticulation, uncued speaker and language switching, and emotional speech, collected in the form of interviews from over 52,000 speakers in 32 languages. We describe this new testbed for developing speech recognition algorithms and report on the performance of well-known techniques for building better acoustic models for the speaking styles seen in this corpus. The best English ASR system to date has a word error rate of 43.8% on this corpus.
Keywords
natural languages; records management; speech recognition; ASR; English; MALACH project; accents; acoustic models; age-related coarticulation; automated speech recognition; automatic transcription; disfluencies; emotional speech; interviews; large spoken archives; natural speech; speaking styles; word error rate; Acoustic testing; Automatic speech recognition; Error analysis; History; Humans; Information retrieval; Information technology; Loudspeakers; Natural languages; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1198756
Filename
1198756