Pronunciation modeling for dialectal Arabic speech recognition

Author

Al-Haj, Hassan ; Hsiao, Roger ; Lane, Ian ; Black, Alan W. ; Waibel, Alex

Author_Institution

Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear

2009

fDate

Dec. 13 2009-Dec. 17 2009

Firstpage

525

Lastpage

528

Abstract

Short vowels in Arabic are normally omitted in written text, which leads to ambiguity in pronunciation. This ambiguity is even more pronounced in dialectal Arabic, where a single word can be pronounced quite differently depending on the speaker's nationality, level of education, social class, and religion. In this paper we focus on pronunciation modeling for Iraqi-Arabic speech. We introduce multiple pronunciations into the Iraqi speech recognition lexicon and compare performance with and without weights, computed via forced alignment, assigned to the different pronunciations of a word. Incorporating multiple pronunciations improved recognition accuracy over a single-pronunciation baseline, and introducing pronunciation weights improved performance further. Using these techniques, an absolute reduction in word error rate of 2.4% was obtained compared to the baseline system.
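
The abstract does not say how the pronunciation weights are derived from the forced alignments. A minimal sketch, assuming the common relative-frequency estimate (each variant's weight is the share of alignments in which it was selected), might look like the following. The function name `pronunciation_weights`, the `smoothing` parameter, and the example word and phone strings are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter, defaultdict


def pronunciation_weights(aligned, smoothing=0.0):
    """Estimate per-word pronunciation weights by relative frequency.

    aligned: iterable of (word, pronunciation) pairs, one per word token,
             where the pronunciation is the variant picked by forced alignment.
    smoothing: optional additive constant applied to each observed variant
               (an assumption of this sketch, not from the paper).
    """
    counts = defaultdict(Counter)
    for word, pron in aligned:
        counts[word][pron] += 1

    weights = {}
    for word, variant_counts in counts.items():
        total = sum(variant_counts.values()) + smoothing * len(variant_counts)
        weights[word] = {
            pron: (n + smoothing) / total for pron, n in variant_counts.items()
        }
    return weights


# Hypothetical alignment output: an unvowelled word form and the
# phone sequence the aligner chose for each occurrence.
example_alignment = [
    ("ktb", "k i t a b"),
    ("ktb", "k i t a b"),
    ("ktb", "k t a b"),
    ("ktb", "k i t a b"),
]
example_weights = pronunciation_weights(example_alignment)
```

In this sketch the weights for each word sum to one, so they can be treated as a distribution over that word's lexicon entries during decoding; how the paper's system actually applies them is not stated in the abstract.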

Keywords

linguistics; natural language processing; speech recognition; word processing; Iraqi-Arabic speech; dialectal speech recognition; education level; pronunciation modeling; pronunciation weights; short vowels; speaker nationality; speaker religion; speaker social class; word-error rate; written text; Automatic speech recognition; Books; Computer science; Context; Decoding; Dictionaries; Government; Natural languages; Predictive models; Speech recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location

Merano

Print_ISBN

978-1-4244-5478-5

Electronic_ISBN

978-1-4244-5479-2

Type

conf

DOI

10.1109/ASRU.2009.5373245

Filename

5373245