• DocumentCode
    1336394
  • Title
    Integration of phonetic and prosodic information for robust utterance verification
  • Author
    Wu, C.-H. ; Chen, Y.-J. ; Yan, G.-L.
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
  • Volume
    147
  • Issue
    1
  • fYear
    2000
  • fDate
    2/1/2000
  • Firstpage
    55
  • Lastpage
    61
  • Abstract
    Mandarin speech is known for its tonal characteristic, and prosodic information plays an important role in Mandarin speech recognition. Driven by this property, phonetic and prosodic information are integrated and used for Mandarin telephone speech keyword spotting. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 132 subsyllable models, two general acoustic filler models and one background/silence model are separately trained and used as the basic recognition units. For utterance verification, 12 anti-subsyllable models, 175 context-dependent prosodic models and five anti-prosodic models are constructed. A keyword verification function combining phonetic-phase and prosodic-phase verification is investigated. Using a test set of 3088 conversational speech utterances from 33 speakers (20 males and 13 females) and a vocabulary of 2583 faculty names, the proposed verification method yields an 18.3% false alarm rate at 8.5% false rejection. Furthermore, the method correctly rejects 90.9% of non-keywords. Comparison with a baseline system without prosodic-phase verification shows that prosodic information benefits verification performance.
  • Keywords
    feature extraction; natural languages; speech processing; speech recognition; Mandarin speech; Mandarin speech recognition; Mandarin telephone speech keyword spotting; acoustic filler models; anti-prosodic models; anti-subsyllable models; background/silence model; baseline system; context-dependent prosodic models; keyword recognition; phonetic information; prosodic information; prosodic-phase verification; robust utterance verification; tonal characteristic; two-stage strategy; verification performance
  • fLanguage
    English
  • Journal_Title
    Vision, Image and Signal Processing, IEE Proceedings -
  • Publisher
    IET
  • ISSN
    1350-245X
  • Type
    jour
  • DOI
    10.1049/ip-vis:20000099
  • Filename
    842718