Title :
Arabic letters corpus based Malay speaker-independent
Author :
Almisreb, Ali Abd ; Abidin, Ahmad Farid ; Md Tahir, Nooritawati
Author_Institution :
Fac. of Electr. Eng., Univ. Teknol. MARA, Shah Alam, Malaysia
Abstract :
Arabic is used as a second language by a large number of Muslims for reciting the Qur'an, the holy book of Islam. This paper describes an effective and usable corpus of Arabic letters uttered by Malay speakers. The corpus can be used to study the properties of, and differences in, pronunciation by non-native speakers. It consists of 1400 samples recorded by 50 Malay individuals (25 males and 25 females). The recordings were made with a low-sensitivity device at a sampling rate of 11025 Hz, and the zero-crossing rate was used to remove noise and retain only the significant portion of the speech signal. This database is intended to be a pioneering corpus for speech recognition, specifically for the Malay community.
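The abstract names the zero-crossing rate (ZCR) as the cue for discarding noise and keeping the significant part of each recording, but it gives no procedure. The following is only a minimal sketch of one common way to do this, not the authors' method. It assumes the background is broadband noise (high ZCR) and the wanted speech is mostly voiced (low ZCR). The function names, frame length, hop size, and threshold `zcr_max` are hypothetical choices; only the 11025 Hz sampling rate comes from the abstract.

```python
import numpy as np

SAMPLE_RATE = 11025  # Hz, the sampling rate stated in the abstract


def zero_crossing_rate(signal, frame_len=256, hop=128):
    """Return, per frame, the fraction of adjacent samples that change sign."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    rates = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        signs = np.signbit(frame)
        # A crossing occurs wherever consecutive samples differ in sign.
        rates[i] = np.mean(signs[1:] != signs[:-1])
    return rates


def trim_by_zcr(signal, frame_len=256, hop=128, zcr_max=0.25):
    """Keep the span from the first to the last frame with ZCR below zcr_max.

    Assumption (not from the source): frames dominated by broadband noise
    have a high ZCR, while voiced speech has a low one. The threshold is an
    illustrative value and would need tuning on real recordings.
    """
    signal = np.asarray(signal, dtype=float)
    rates = zero_crossing_rate(signal, frame_len, hop)
    keep = np.flatnonzero(rates < zcr_max)
    if keep.size == 0:
        return signal[:0]  # nothing looked like speech
    start = keep[0] * hop
    end = keep[-1] * hop + frame_len
    return signal[start:end]
```

Calling `trim_by_zcr(samples)` on a one-dimensional array of samples returns the contiguous slice judged to contain speech. Practical endpoint detectors usually combine ZCR with short-time energy, since digital silence and unvoiced consonants do not fit the simple low-ZCR assumption used here.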
Keywords :
natural language processing; signal denoising; speech recognition; Arabic language pronunciation difference; Arabic language properties; Arabic letter corpus-based Malay speaker-independent; Islamic holy book recitation; Muslims; Qur'an recitation; corpus database; low-sensitive device; noise removal; sampling rate; second language; speech recognition; speech signal; zero-crossing rate; Conferences; Databases; Microphones; Noise; Speech; Speech recognition; Systems engineering and theory; Arabic language; Malay speakers; Matlab; Zero Cross Rate; corpus;
Conference_Title :
System Engineering and Technology (ICSET), 2013 IEEE 3rd International Conference on
Conference_Location :
Shah Alam
Print_ISBN :
978-1-4799-1028-1
DOI :
10.1109/ICSEngT.2013.6650176