DocumentCode :
1833096
Title :
Diacritization, automatic segmentation and labeling for Levantine Arabic speech
Author :
Alotaibi, Yousef A. ; Meftah, Ali H. ; Selouani, Sid-Ahmed
Author_Institution :
College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
fYear :
2013
fDate :
11-14 Aug. 2013
Firstpage :
7
Lastpage :
11
Abstract :
It is generally acknowledged that a reliable speech corpus is necessary for any application involving speech processing. In this paper, we propose methods to improve the BBN/AUB DARPA Babylon Levantine Arabic speech corpus to increase its reliability and efficiency. For this purpose, correction of pronunciation, diacritization, and new transcription are performed manually along with automatic phoneme segmentation and labeling. The comparison with the original transcription of the corpus shows a clear improvement in the output results.
Keywords :
natural language processing; speech processing; BBN-AUB DARPA Babylon Levantine Arabic speech corpus; automatic phoneme labeling; automatic phoneme segmentation; diacritization correction; pronunciation correction; transcription; Educational institutions; Hidden Markov models; Labeling; Reliability; Speech; Speech recognition; BBN/AUB; Levantine; diacritics; dialect
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2013 IEEE Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE)
Conference_Location :
Napa, CA
Print_ISBN :
978-1-4799-1614-6
Type :
conf
DOI :
10.1109/DSP-SPE.2013.6642556
Filename :
6642556