DocumentCode :
2348459
Title :
Comparative evaluation of two Arabic speech corpora
Author :
Alotaibi, Yousef Ajami ; Meftah, Ali Hamid
Author_Institution :
Comput. Eng. Dept., King Saud Univ., Riyadh, Saudi Arabia
fYear :
2010
fDate :
21-23 Aug. 2010
Firstpage :
1
Lastpage :
5
Abstract :
The aim of this paper is to conduct a constructive, comparative evaluation of two important Arabic corpora covering two different Arabic dialects: a Saudi dialect corpus collected by King Abdulaziz City for Science and Technology (KACST), and a Levantine Arabic dialect corpus produced by the Linguistic Data Consortium (LDC). The Levantine dialect is spoken by ordinary Lebanese, Jordanian, Syrian, and Palestinian people. The advantages and disadvantages of the two corpora are presented and discussed. This discussion aims to help digital speech processing researchers identify the strengths and weaknesses of these corpora before using them in their experiments. Moreover, this paper may motivate the design, maintenance, distribution, and upgrading of Arabic corpora to support the Arabic speech research community.
Keywords :
natural language processing; speech processing; Arabic dialects; Arabic speech corpora; Levantine Arabic dialect corpus; Saudi dialect corpus; digital speech processing; Linguistic Data Consortium; Databases; Modulation; Noise measurement; Phase change materials; Speech; Vocabulary; XML; Arabic; BBN/AUB; Levantine; MSA; SAAVB
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
Type :
conf
DOI :
10.1109/NLPKE.2010.5587819
Filename :
5587819