Title :
Diacritization, automatic segmentation and labeling for Levantine Arabic speech
Author :
Alotaibi, Yousef A. ; Meftah, Ali H. ; Selouani, Sid-Ahmed
Author_Institution :
Coll. of Comput. & Inf. Sci., King Saud Univ., Riyadh, Saudi Arabia
Abstract :
It is generally acknowledged that a reliable speech corpus is necessary for any speech processing application. In this paper, we propose methods to improve the reliability and efficiency of the BBN/AUB DARPA Babylon Levantine Arabic speech corpus. To this end, pronunciation correction, diacritization, and a new transcription are performed manually, alongside automatic phoneme segmentation and labeling. A comparison with the corpus's original transcription shows a clear improvement in the output results.
Keywords :
natural language processing; speech processing; BBN-AUB DARPA Babylon Levantine Arabic speech corpus; automatic phoneme labeling; automatic phoneme segmentation; diacritization correction; pronunciation correction; transcription; Educational institutions; Hidden Markov models; Labeling; Reliability; Speech; Speech recognition; BBN/AUB; Levantine; diacritics; dialect
Conference_Title :
Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE), 2013 IEEE
Conference_Location :
Napa, CA
Print_ISBN :
978-1-4799-1614-6
DOI :
10.1109/DSP-SPE.2013.6642556