• DocumentCode
    1739538
  • Title

    Speaker normalization based on the generalized time-frequency representation and Mellin transform

  • Author

    Dongmei, Jiang ; Rongchun, Zhao

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Northwestern Polytech. Univ., Xi'an, China
  • Volume
    2
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    782
  • Abstract
    For vocal tract length normalization in speaker-independent speech recognition, a novel feature extraction method is proposed, based on the generalized time-frequency representation with cone-shaped kernel (CK-GTFR) and the Mellin transform. The GTFR is superior to other representations in suppressing cross terms while providing good time and frequency resolution simultaneously. The Mellin transform makes the features insensitive to differences in vocal tract length. F-ratio tests show that the proposed features have higher separation ability than the FFT cepstrum or FFT-Mellin cepstrum, and are superior to the Mel cepstrum in most cases.
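The abstract's claim that the Mellin transform yields features insensitive to vocal tract length can be illustrated with a small sketch. It is not the paper's CK-GTFR pipeline. It assumes, as is common in vocal tract length normalization, that a length difference acts roughly as a scaling of the frequency axis, and it approximates the Mellin magnitude by resampling on a log-frequency grid and taking a DFT magnitude. The names `toy_spectrum`, `scaled`, and `mellin_magnitude`, the Gaussian envelope, and all parameter values are hypothetical choices made for illustration.

```python
import cmath
import math


def log_resample(spectrum, f_min, f_max, n_points):
    """Sample spectrum(f) on a grid that is uniform in ln(f)."""
    step = (math.log(f_max) - math.log(f_min)) / (n_points - 1)
    return [spectrum(math.exp(math.log(f_min) + i * step)) for i in range(n_points)]


def dft_magnitude(samples, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients (naive DFT)."""
    n = len(samples)
    mags = []
    for k in range(n_coeffs):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def mellin_magnitude(spectrum, f_min=50.0, f_max=8000.0, n_points=256, n_coeffs=12):
    """Discrete stand-in for |Mellin transform|: log-frequency warp, then |DFT|.

    Scaling f by alpha becomes a shift by ln(alpha) on the log axis,
    and the DFT magnitude does not change under a shift.
    """
    return dft_magnitude(log_resample(spectrum, f_min, f_max, n_points), n_coeffs)


def toy_spectrum(f, peak_hz=700.0, width=0.25):
    """Hypothetical single-formant envelope: a Gaussian bump in ln(f)."""
    return math.exp(-((math.log(f) - math.log(peak_hz)) ** 2) / (2 * width ** 2))


def scaled(spectrum, alpha):
    """Assumed effect of a different vocal tract length: S(f) -> S(alpha * f)."""
    return lambda f: spectrum(alpha * f)


reference = mellin_magnitude(toy_spectrum)
warped = mellin_magnitude(scaled(toy_spectrum, 1.2))
max_diff = max(abs(a - b) for a, b in zip(reference, warped))
print("largest feature difference after 20% frequency scaling:", max_diff)
```

The two envelopes differ visibly on the linear frequency axis, yet their log-warped DFT magnitudes agree closely, because the toy envelope decays to nearly zero inside the analysis band. For real speech spectra the invariance would only be approximate.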
  • Keywords
    feature extraction; signal representation; speech recognition; time-frequency analysis; transforms; F-ratio tests; FFT cepstrum; FFT-Mellin cepstrum; Mel cepstrum; Mellin transform; cone-shaped kernel; cross terms suppression; feature extraction method; frequency resolution; generalized time-frequency representation; speaker normalization; speaker-independent speech recognition; time resolution; vocal tract length normalization; vocal tract lengths; Cepstrum; Computer science; Feature extraction; Fourier transforms; Kernel; Loudspeakers; Signal resolution; Speech recognition; Testing; Time frequency analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-5747-7
  • Type

    conf

  • DOI
    10.1109/ICOSP.2000.891628
  • Filename
    891628