DocumentCode :
1890744
Title :
Maxout based deep neural networks for Arabic phonemes recognition
Author :
AbdAlmisreb, Ali ; Abidin, Ahmad Farid ; Md Tahir, Nooritawati
Author_Institution :
Fac. of Electr. Eng., Univ. Teknologi MARA, Shah Alam, Malaysia
fYear :
2015
fDate :
6-8 March 2015
Firstpage :
192
Lastpage :
197
Abstract :
Arabic is widely spoken by Malay people for reasons such as performing worship and reciting the Holy Book of Muslims. Recently, Maxout deep neural networks have brought substantial improvements to speech recognition systems. Hence, this paper introduces a fully connected feed-forward neural network with Maxout units. The proposed deep neural network involves three hidden layers, 500 Maxout units, and 2 neurons per unit, with Mel-Frequency Cepstral Coefficients (MFCC) as the features extracted from the phoneme waveforms. The network is trained and tested on a corpus of consonant Arabic phonemes recorded from 20 Malay speakers. Each speaker is asked to pronounce the twenty-eight consonant phonemes and is given three attempts to articulate all the letters; each attempt is captured as one continuous recording of all the letters. The recordings are made with a SAMSON C03U USB multi-pattern condenser microphone. The data are divided into five waveforms for training the proposed Maxout network and fifteen waveforms for testing. Experimentally, the proposed Dropout function for training performed considerably better than the Sigmoid and Rectified Linear Unit (ReLU) functions. Finally, in testing, the Maxout network compared favorably with the Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Convolutional Neural Network (CNN), the conventional feed-forward neural network (NN), and the Convolutional Auto-Encoder (CAE).
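As an illustration of the Maxout units the abstract describes, the following is a minimal sketch in plain Python, not the authors' implementation. A Maxout unit outputs the maximum over its linear "neurons" (pieces). The helper names, the 13-dimensional MFCC input, the weight initialization, and the reading of "500 Maxout units" as 500 units per hidden layer are all assumptions made for the example; the output classification layer and Dropout are omitted.

```python
import random


def maxout_layer(x, weights, biases):
    """Return the Maxout activations of one layer for input vector x.

    weights[i][j] is the weight vector of piece j in unit i and
    biases[i][j] is its bias; each unit outputs the maximum piece.
    """
    outputs = []
    for unit_w, unit_b in zip(weights, biases):
        pieces = [
            sum(w * xi for w, xi in zip(piece_w, x)) + b
            for piece_w, b in zip(unit_w, unit_b)
        ]
        outputs.append(max(pieces))
    return outputs


def init_layer(n_in, n_units, n_pieces, rng):
    """Create small random weights and zero biases (hypothetical init)."""
    weights = [
        [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_pieces)]
        for _ in range(n_units)
    ]
    biases = [[0.0] * n_pieces for _ in range(n_units)]
    return weights, biases


def forward(x, layers):
    """Pass x through a stack of Maxout layers."""
    for weights, biases in layers:
        x = maxout_layer(x, weights, biases)
    return x


N_MFCC = 13    # assumption: MFCC dimension is not stated in the abstract
N_UNITS = 500  # assumption: 500 Maxout units in each hidden layer
N_PIECES = 2   # from the abstract: 2 neurons per Maxout unit

rng = random.Random(0)
sizes = [N_MFCC, N_UNITS, N_UNITS, N_UNITS]  # three hidden layers
layers = [
    init_layer(n_in, n_out, N_PIECES, rng)
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

features = [0.0] * N_MFCC  # placeholder MFCC frame
hidden = forward(features, layers)
print(len(hidden))  # 500
```

Unlike ReLU, a Maxout unit does not clamp negative values to zero: with pieces evaluating to -1 and -3, the unit outputs -1.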
Keywords :
cepstral analysis; feature extraction; feedforward neural nets; natural language processing; speech recognition; Arabic phonemes recognition; MFCC; Malay race; Malay speakers; ReLU functions; SAMSON C03U USB multipattern condenser microphone; consonant Arabic phonemes; dropout function; feature extraction; fully connected feed-forward neural network; maxout based deep neural networks; maxout units; mel-frequency cepstral coefficients; phonemes waveforms; recording process; rectified linear unit; speech recognition systems; Acoustics; Biological neural networks; Error analysis; Hidden Markov models; Speech recognition; Training; Arabic; Convolutional Neural Network; Deep Belief Network; Deep learning; Maxout Networks;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing & Its Applications (CSPA), 2015 IEEE 11th International Colloquium on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4799-8248-6
Type :
conf
DOI :
10.1109/CSPA.2015.7225644
Filename :
7225644