• Title of article

    Improving the performance of MFCC for Persian robust speech recognition

  • Author/Authors

    Darabian, D. (Department of Electrical Engineering, University of Shahrood, Shahrood, Iran); Marvi, H. (Department of Electrical Engineering, University of Shahrood, Shahrood, Iran); Sharif Noughabi, M. (Department of Electrical Engineering, University of Shahrood, Shahrood, Iran)

  • Issue Information
    Biannual journal, serial issue no. 0, 2015
  • Pages
    8
  • From page
    149
  • To page
    156
  • Abstract
    Mel-cepstrum coefficients are among the most common features in speech recognition systems. Although they are highly effective in such systems, these coefficients are very sensitive to noise. In this paper we propose a noise-robust method for extracting these coefficients, which uses several compensator blocks and modifies several blocks of the baseline algorithm. A neural network is used to test the proposed method. The speech recognition experiments show an improved recognition rate in noisy environments compared with other methods of extracting these coefficients.
  • Abstract
    Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. To achieve satisfactory performance in automatic speech recognition (ASR) applications, this paper introduces a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is applied as a pre-processing step to the noisy original speech signal. The pre-emphasized speech is segmented into overlapping time frames and windowed with a modified Hamming window. Higher-order autocorrelation coefficients are then extracted, and the lower-order autocorrelation coefficients are eliminated. The result is passed through an FFT block, and the power spectrum of the output is calculated. A Gaussian-shaped filter bank is applied to the results. A logarithm and two compensator blocks, one a mean subtraction and the other a root block, are then applied, and the DCT is the last step. An MLP neural network is used to evaluate the performance of the proposed MFCC method and to classify the results. Speech recognition experiments on various tasks indicate that the proposed algorithm is more robust than traditional ones in noisy conditions.
  • Journal title
    Journal of Artificial Intelligence and Data Mining
  • Serial Year
    2015
  • Record number

    2387990
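
The feature-extraction steps listed in the abstract can be illustrated with a rough sketch. This is not the authors' implementation. It assumes a standard Hamming window in place of the paper's modified one, it omits the spectral mean normalization and root blocks because the abstract does not specify them, and it treats mean subtraction as per-band mean removal across frames. The Gaussian filter widths, all function names, and all parameter values are hypothetical.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1]."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def frames(signal, size, hop):
    """Split the signal into overlapping frames, dropping the ragged tail."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def hamming(frame):
    """Standard Hamming window (assumption: the paper uses a modified one)."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def high_lag_autocorr(frame, min_lag):
    """Biased autocorrelation, keeping only lags >= min_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k)) / n
            for k in range(min_lag, n)]


def power_spectrum(x, n_fft):
    """Naive DFT power spectrum for bins 0..n_fft/2 (needs len(x) <= n_fft)."""
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n_fft)
                    for i, v in enumerate(x))) ** 2
            for k in range(n_fft // 2 + 1)]


def hz_to_mel(f):
    return 2595 * math.log10(1 + f / 700)


def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)


def gaussian_filterbank(n_filters, n_fft, sample_rate):
    """Gaussian-shaped filters centred at equally spaced mel frequencies."""
    top = hz_to_mel(sample_rate / 2)
    centers = [mel_to_hz(top * (m + 1) / (n_filters + 1)) * n_fft / sample_rate
               for m in range(n_filters)]  # centre positions in FFT bins
    bank = []
    for j, c in enumerate(centers):
        left = centers[j - 1] if j > 0 else 0.0
        sigma = max((c - left) / 2, 0.5)  # assumed width rule
        bank.append([math.exp(-0.5 * ((b - c) / sigma) ** 2)
                     for b in range(n_fft // 2 + 1)])
    return bank


def log_energies(power, bank, floor=1e-10):
    """Log of each filter's weighted energy, floored to avoid log(0)."""
    return [math.log(max(sum(w * p for w, p in zip(f, power)), floor))
            for f in bank]


def subtract_mean(rows):
    """Remove the per-band mean taken across all frames."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def dct2(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(x)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(x)) for k in range(n_coeffs)]


def robust_mfcc(signal, sample_rate, size=64, hop=32, min_lag=4,
                n_filters=12, n_coeffs=8):
    """Pre-emphasis, framing, windowing, high-lag autocorrelation, power
    spectrum, Gaussian filter bank, log, mean subtraction, DCT."""
    bank = gaussian_filterbank(n_filters, size, sample_rate)
    rows = []
    for frame in frames(pre_emphasize(signal), size, hop):
        acf = high_lag_autocorr(hamming(frame), min_lag)
        rows.append(log_energies(power_spectrum(acf, size), bank))
    return [dct2(row, n_coeffs) for row in subtract_mean(rows)]
```

Calling `robust_mfcc(samples, 8000)` on a list of float samples returns one coefficient vector per frame. Dropping the low autocorrelation lags is the step aimed at noise: broadband noise tends to concentrate its autocorrelation energy near lag zero, whereas voiced speech remains correlated at longer lags. The naive DFT keeps the sketch dependency-free; an FFT routine would be the usual choice for real frame sizes.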