• Conference record number
    1730
  • Paper title

    Feature Bandwidth Extension for Persian Conversational Telephone Speech Recognition

  • Title in another language
    Feature Bandwidth Extension for Persian Conversational Telephone Speech Recognition
  • Authors

    Goodarzi Mohammad Mohsen (author), Almasganj Farshad (author), Kabudian Jahanshah (author), Shekofteh Yasser (author), Sarraf Rezaei Iman (author)

  • Number of pages
    4
  • Keywords
    feature bandwidth extension , Estimation theory , Neural network , conversational telephony speech recognition , Gaussian processes , Speaker Recognition , Gaussian Mixture Model , Neural nets
  • Publication year
    2012
  • Conference title
    20th Iranian Conference on Electrical Engineering
  • Document language
    Persian
  • English abstract
    The goal of this paper is to configure a complete setup for continuous conversational telephone speech recognition in Persian. For this purpose, two common methods, the Gaussian Mixture Model (GMM) and the Neural Network (NN), and a proposed hybrid GMM-NN method have been considered for estimating full-bandwidth features from band-limited features. The performance of these methods has been evaluated with two different features, one spectral and one cepstral: LFBE and MFCC. The effect of speaker gender on the estimation process has also been investigated. Our results showed that the best phoneme recognition accuracy is obtained when MFCC features are reconstructed using two gender-dependent neural networks. In this configuration, phoneme accuracy was about 1.6% higher than the baseline. The tests were carried out on the TFarsDat corpus.
  • Conference document number
    4460809
  • From page
    1
  • To page
    4