• DocumentCode
    3529308
  • Title

    Voiced/unvoiced pattern-based duration modeling for language identification

  • Author

    Yin, Bo ; Ambikairajah, Eliathamby ; Chen, Fang

  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4341
  • Lastpage
    4344
  • Abstract
    Most existing duration modeling approaches rely on a phone recognizer and require a manually annotated corpus to train the segmentation models, which is usually costly and time-consuming. In this paper, a novel duration modeling approach is proposed that requires neither a phone recognizer nor annotated training data, and that enables fast computation for language identification. In this approach, segmentation is performed using articulatory features such as voicing status. A pair of connected unvoiced and voiced segments is treated as one unit, and the duration of each segment is normalized per utterance and then quantized into 20 discrete ranges. The resulting range indices are treated as symbol sequences and modeled with n-gram models to capture the temporal pattern, which is hypothesized to vary across languages. Experiments on the NIST LRE 2005 tasks show a 19.7% relative EER improvement when the proposed duration modeling-based system is added to a fusion system containing two GMM-UBM based acoustic systems using MFCC and pitch+intensity features.
  • Keywords
    natural language processing; speech recognition; GMM-UBM; MFCC; acoustic systems; duration modeling approaches; duration modeling-based system; fusion system; language identification; phone recognizer; pitch+intensity features; segmentation models; unvoiced pattern; voicing status; Australia; Data mining; Loudspeakers; Mel frequency cepstral coefficient; NIST; Natural languages; Pattern recognition; Speech recognition; Target recognition; Training data; articulatory features; duration modeling; language identification; quantization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2009.4960590
  • Filename
    4960590
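
The pipeline outlined in the abstract (voicing-based segmentation, unvoiced/voiced pairing, per-utterance normalization, quantization into 20 ranges, n-gram counting) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the 10 ms frame length, normalization by the utterance's mean segment duration, the uniform bin edges, and the `max_ratio` clipping value are all assumptions made for illustration; the record does not specify any of them. The voicing decisions are taken as given input.

```python
from collections import Counter
from itertools import groupby


def uv_units(voicing, frame_ms=10.0):
    """Pair each unvoiced run with the voiced run that follows it.

    `voicing` is a per-frame sequence of booleans (True = voiced).
    Returns a list of (unvoiced_ms, voiced_ms) tuples.
    The frame length is an assumed value.
    """
    runs = [(v, sum(1 for _ in g) * frame_ms) for v, g in groupby(voicing)]
    # Skip a leading voiced run so every unit starts with an unvoiced segment.
    i = 1 if runs and runs[0][0] else 0
    units = []
    while i + 1 < len(runs):
        units.append((runs[i][1], runs[i + 1][1]))
        i += 2  # runs alternate, so the next unvoiced run is two steps on
    return units


def quantize(units, n_bins=20, max_ratio=4.0):
    """Normalize durations per utterance and map them to n_bins ranges.

    Assumption: normalize by the mean segment duration of the utterance,
    then use uniform bins over [0, max_ratio), clipping larger values.
    """
    if not units:
        return []
    durations = [d for unit in units for d in unit]
    mean = sum(durations) / len(durations)

    def to_bin(d):
        return min(int(d / mean / max_ratio * n_bins), n_bins - 1)

    return [(to_bin(u), to_bin(v)) for u, v in units]


def ngram_counts(symbols, n=2):
    """Count n-grams over the sequence of quantized unit symbols."""
    return Counter(
        tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)
    )


if __name__ == "__main__":
    # Toy utterance: 3 unvoiced, 5 voiced, 2 unvoiced, 2 voiced frames.
    frames = [False] * 3 + [True] * 5 + [False] * 2 + [True] * 2
    symbols = quantize(uv_units(frames))
    print(symbols)
    print(ngram_counts(symbols, n=2))
```

In a language identification setting, one such n-gram model would presumably be estimated per language from counts like these, and a test utterance scored against each model. That scoring step is left out here because the record gives no details about it.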