• DocumentCode
    1062802
  • Title

    High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch

  • Author

    Wang, Tianyu T. ; Quatieri, Thomas F.

  • Author_Institution
    Lincoln Lab., Massachusetts Inst. of Technol., Lexington, MA, USA
  • Volume
    18
  • Issue
    1
  • fYear
    2010
  • Firstpage
    171
  • Lastpage
    186
  • Abstract
    This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological studies implicating the use of pitch dynamics in speech by humans. We develop and assess signal processing schemes aimed at exploiting temporal change of pitch to address the high-pitch formant frequency estimation problem. Specifically, we propose a 2-D analysis framework using 2-D transformations of the time-frequency space. In one approach, we project changing spectral harmonics over time to a 1-D function of frequency. In a second approach, we draw upon previous work of Quatieri and Ezzat, with similarities to the auditory modeling efforts of Chi, where localized 2-D Fourier transforms of the time-frequency space provide improved source-filter separation when pitch is changing. Our methods show quantitative improvements for synthesized vowels with stationary formant structure in comparison to traditional and homomorphic linear prediction. We also demonstrate the feasibility of applying our methods on stationary vowel regions of natural speech spoken by high-pitch females of the TIMIT corpus. Finally, we show improvements afforded by the proposed analysis framework in formant tracking on examples of stationary and time-varying formant structure.
  • Keywords
    acoustic signal processing; hearing; speech; 2D Fourier transforms; 2D analysis framework; TIMIT corpus; auditory modeling; auditory perception; high-pitch formant estimation; homomorphic linear prediction; natural speech; physiological studies; pitch dynamics; signal processing; source-filter separation; spectral harmonics; spectrotemporal analysis; speech formant structure; stationary vowel regions; temporal pitch change; time-frequency space 2D transformation; voicing source; Formant estimation; high-pitch effects; linear prediction; temporal change of pitch
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2009.2024732
  • Filename
    5067370