• DocumentCode
    226408
  • Title
    The optimization of PLP feature extraction for LVCSR recognition of MP3 data
  • Author
    Borsky, Michal ; Pollak, Petr
  • Author_Institution
    Fac. of Electr. Eng., Czech Tech. Univ. in Prague, Prague, Czech Republic
  • fYear
    2014
  • fDate
    9-10 Sept. 2014
  • Firstpage
    55
  • Lastpage
    58
  • Abstract
    This paper analyses how an optimized PLP feature extraction setup and the application of feature normalization improve the performance of an automatic speech recognition system on data compressed by the MP3 algorithm. The experimental study, performed on loop-digit recognition and large vocabulary continuous speech recognition (LVCSR) tasks, showed that a proper setup can negate the effect of lower compression rates, which can then achieve results comparable with higher rates. The second finding is that the normalization techniques contribute significantly to overall performance, especially for shorter windows/shifts and lower compression rates. The acoustic models trained on 160 kbit/s, 32 kbit/s and 16 kbit/s data performed at 34.17%, 41.88% and 36.4% WER respectively on the LVCSR task. In comparison, the non-compressed acoustic models performed at 28.56% WER.
  • Keywords
    acoustic signal processing; data compression; feature extraction; speech coding; speech recognition; vocabulary; MP3 algorithm; MP3 data LVCSR recognition; PLP feature extraction optimization; WER; automatic speech recognition system; feature normalization technique; loop-digit recognition; noncompressed acoustic model; word error rate; Acoustic distortion; Acoustics; Digital audio players; Feature extraction; Speech; Speech recognition; LVCSR; MP3 compression; PLP features; acoustic modelling; cepstral mean and variance normalization; cepstral mean normalization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applied Electronics (AE), 2014 International Conference on
  • Conference_Location
    Pilsen
  • ISSN
    1803-7232
  • Print_ISBN
    978-8-0261-0276-2
  • Type
    conf
  • DOI
    10.1109/AE.2014.7011667
  • Filename
    7011667