• DocumentCode
    263849
  • Title

    Perceptual MVDR-based unsupervised built-in speaker normalization for Kazakh speech recognition

  • Author

    Yessenbayev, Zhandos ; Yapanel, Umit

  • Author_Institution
    Nazarbayev Univ. Res. & Innovation Syst., Astana, Kazakhstan
  • fYear
    2014
  • fDate
    15-17 Oct. 2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In this work, we present a novel approach to unsupervised speaker normalization built on top of the Perceptual MVDR-based Built-in Speaker Normalization technique. We show that the proposed method can be effective for phonetic recognition on TIMIT and then apply it to Kazakh speech recognition. The experiments show that the method can improve the performance of ASR systems by up to 20% relative. Analysis of the optimal warp factors selected by the algorithm revealed a clear gender separation ability, which may be useful for gender/speaker classification tasks.
  • Keywords
    natural language processing; speech recognition; ASR systems; Kazakh speech recognition; TIMIT; gender classification tasks; gender separation ability; optimal warp factor selection; perceptual MVDR-based unsupervised built-in speaker normalization; phonetic recognition; speaker classification tasks; Acoustics; Algorithm design and analysis; Feature extraction; Hidden Markov models; Speech; Training; Unsupervised speaker normalization; phone recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application of Information and Communication Technologies (AICT), 2014 IEEE 8th International Conference on
  • Conference_Location
    Astana
  • Print_ISBN
    978-1-4799-4120-9
  • Type

    conf

  • DOI
    10.1109/ICAICT.2014.7035914
  • Filename
    7035914