  • DocumentCode
    78245
  • Title
    Event-Based Method for Instantaneous Fundamental Frequency Estimation from Voiced Speech Based on Eigenvalue Decomposition of the Hankel Matrix
  • Author
    Jain, Paril ; Pachori, Ram Bilas
  • Author_Institution
    Discipline of Electr. Eng., Indian Inst. of Technol. Indore, Indore, India
  • Volume
    22
  • Issue
    10
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    1467
  • Lastpage
    1482
  • Abstract
    We propose a robust event-based method for estimation of the instantaneous fundamental frequency of a voiced speech signal. The amplitude- and frequency-modulated (AM-FM) signal model of voiced speech in the low frequency range (LFR) indicates the presence of energy only around its instantaneous fundamental frequency (F0) and a few of its harmonics. The time-varying F0 component of a voiced speech signal is extracted by a robust algorithm which iteratively performs eigenvalue decomposition (EVD) of the Hankel matrix, initially constructed from samples of the LFR-filtered voiced speech signal. The negative cycles of the extracted time-varying F0 component provide a reliable coarse estimate of intervals where glottal closure instants (GCIs) may be present. The negative cycles of the LFR-filtered voiced speech signal occurring within these intervals are isolated. There is a sudden decrease in the glottal impedance at GCIs, resulting in high signal strength. Therefore, GCIs are detected as local minima in the derivative of the falling edges of the isolated negative cycles of the LFR-filtered voiced speech signal, followed by a selection criterion to discard false GCI candidates. The instantaneous F0 is estimated as the inverse of the time interval between two consecutive GCIs. Experiments were performed on the Keele and CSTR speech databases in white and babble noise environments at various levels of degradation to assess the performance of the proposed method. The proposed method substantially reduces the gross F0 estimation errors in comparison to some state-of-the-art methods.
  • Keywords
    Hankel matrices; eigenvalues and eigenfunctions; frequency estimation; speech synthesis; time-varying systems; CSTR speech databases; EVD; Eigenvalue decomposition; Hankel matrix; Keele speech databases; LFR filtered voiced speech signal; amplitude modulated signal; event-based method; frequency modulated signal; fundamental frequency; glottal closure instants; glottal impedance; harmonics; instantaneous fundamental frequency estimation; low frequency range; robust event-based method; time-varying component; Harmonic analysis; Noise; Noise measurement; Speech; Speech processing; Time-frequency analysis; instantaneous fundamental frequency; speech signal processing
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2335056
  • Filename
    6847702
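
Two steps named in the abstract can be illustrated with a minimal sketch: building a Hankel matrix from signal samples, and turning consecutive GCI locations into instantaneous F0 values. This is not the authors' implementation. The function names `hankel_matrix` and `instantaneous_f0` are hypothetical, the square K x K layout from 2K - 1 samples is an assumption (the record does not state the matrix size), and the GCI sample indices are assumed to be already detected. The iterative EVD and the GCI detection itself are not shown.

```python
def hankel_matrix(samples):
    """Build a square Hankel matrix H with H[i][j] = samples[i + j].

    Assumption: an odd number of samples, 2K - 1, giving a K x K matrix.
    The record does not specify the size used in the paper.
    """
    n = len(samples)
    if n == 0 or n % 2 == 0:
        raise ValueError("expected an odd, non-zero number of samples")
    k = (n + 1) // 2
    return [[samples[i + j] for j in range(k)] for i in range(k)]


def instantaneous_f0(gci_samples, fs):
    """Return one F0 value (Hz) per pair of consecutive GCIs.

    gci_samples: strictly increasing sample indices of detected GCIs
                 (assumed to be given; detection is not implemented here).
    fs:          sampling rate in Hz.
    F0 is the inverse of the time interval between consecutive GCIs,
    i.e. fs / (number of samples between them).
    """
    pairs = list(zip(gci_samples, gci_samples[1:]))
    if any(b <= a for a, b in pairs):
        raise ValueError("GCI indices must be strictly increasing")
    return [fs / (b - a) for a, b in pairs]
```

For example, GCIs spaced 160 samples apart at a 16 kHz sampling rate correspond to a 10 ms pitch period, so `instantaneous_f0([0, 160, 320], 16000)` yields 100 Hz for each interval.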