• DocumentCode
    2279221
  • Title

    Japanese phonetic feature extraction for automatic speech recognition

  • Author

    Banik, Manoj ; Eity, Qamrun Nahar ; Lisa, Nusrat Jahan ; Hassan, Foyzul ; Saha, Aloke Kumar ; Huda, Mohammad Nurul

  • Author_Institution
    Dept. of CSE, Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
  • fYear
    2010
  • fDate
    15-17 Dec. 2010
  • Firstpage
    143
  • Lastpage
    147
  • Abstract
    This paper presents a method for extracting distinctive phonetic features (DPFs) for automatic speech recognition (ASR). The method comprises three stages: i) an acoustic feature extractor, ii) a multilayer neural network (MLN), and iii) a hidden Markov model (HMM) based classifier. In the first stage, acoustic features, namely local features (LFs), are extracted from the input speech. In the second stage, the MLN generates a 45-dimensional DPF vector from the 75-dimensional LFs. Finally, this 45-dimensional DPF vector is fed into the HMM-based classifier to obtain phoneme strings. Experiments on Japanese Newspaper Article Sentences (JNAS) show that the proposed DPF extractor provides a higher phoneme correct rate and accuracy with fewer mixture components in the HMMs than the method based on mel frequency cepstral coefficients (MFCCs). Moreover, a higher correct rate for each phonetic feature is obtained using the proposed method.
  • Keywords
    acoustic signal processing; feature extraction; hidden Markov models; multilayer perceptrons; natural language processing; speech processing; speech recognition; Japanese Newspaper Article Sentences; Japanese phonetic feature extraction; acoustic feature extractor; automatic speech recognition; hidden Markov model based classifier; multilayer neural network; Artificial neural networks; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition; automatic speech recognition; distinctive phonetic features; hidden Markov model; mel frequency cepstral coefficient; multilayer neural network;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Image Processing (ICSIP), 2010 International Conference on
  • Conference_Location
    Chennai
  • Print_ISBN
    978-1-4244-8595-6
  • Type

    conf

  • DOI
    10.1109/ICSIP.2010.5697458
  • Filename
    5697458