• DocumentCode
    2307203
  • Title

    Directory name retrieval over the telephone in the Picasso project

  • Author

    Neubert, F. ; Gravier, Guillaume ; Yvon, F. ; Chollet, G.

  • Author_Institution
    Ecole Nat. Superieure des Telecommun., Paris, France
  • fYear
    1998
  • fDate
    29-30 Sep 1998
  • Firstpage
    31
  • Lastpage
    36
  • Abstract
    The European project Picasso intends to develop and test several telematics transaction services that will be accessible via the worldwide telephone network. In this framework, ENST is developing an automatic speech recognition system for pronounced and spelled names, for telephone-quality speech in French. The recognizer is based on Hidden Markov modeling of speech units, using word models for spelled letters and phone models for name pronunciation. Bigram probabilities are introduced at this stage for phonemes and letters, in order to improve the quality of decoding. The directory was built automatically from the list of names contained in the database, using a grapheme-to-phoneme converter for the names and rules for spellings, each entry in the directory consisting of several pronunciation and spelling variants. After the acoustic recognition phase, the corresponding entry in the directory is found using dynamic alignment of symbol sequences, with insertion, deletion and substitution costs determined from the training data to take into account acoustic confusability. As this lexical search is very time consuming for large directories, we present a faster method using pre-selection in a tree-based representation of the lexicon. A rescoring strategy on the 10 best outputs is also evaluated.
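The dynamic alignment step described in the abstract can be illustrated with a weighted edit distance. The following is a minimal sketch, not the authors' implementation: the function names `alignment_cost` and `best_entry`, the cost-table layout, the default cost of 1.0, and the toy directory are all assumptions made for illustration. In the paper the costs are estimated from training data; here they are supplied by hand.

```python
from typing import Dict, List, Sequence, Tuple

Symbol = str


def alignment_cost(
    hyp: Sequence[Symbol],
    ref: Sequence[Symbol],
    sub_cost: Dict[Tuple[Symbol, Symbol], float],
    ins_cost: Dict[Symbol, float],
    del_cost: Dict[Symbol, float],
    default: float = 1.0,
) -> float:
    """Weighted edit distance between a recognized sequence and a directory variant.

    hyp is the recognizer output, ref is one variant of a directory entry.
    Costs missing from the tables fall back to `default` (an assumption).
    """
    n, m = len(hyp), len(ref)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    # First column: every hypothesis symbol is treated as an insertion.
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + ins_cost.get(hyp[i - 1], default)
    # First row: every reference symbol is treated as a deletion.
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + del_cost.get(ref[j - 1], default)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            h, r = hyp[i - 1], ref[j - 1]
            sub = 0.0 if h == r else sub_cost.get((h, r), default)
            d[i][j] = min(
                d[i - 1][j - 1] + sub,                     # match or substitution
                d[i - 1][j] + ins_cost.get(h, default),    # inserted symbol in hyp
                d[i][j - 1] + del_cost.get(r, default),    # reference symbol deleted
            )
    return d[n][m]


def best_entry(
    hyp: Sequence[Symbol],
    directory: Dict[str, List[Sequence[Symbol]]],
    sub_cost: Dict[Tuple[Symbol, Symbol], float],
    ins_cost: Dict[Symbol, float],
    del_cost: Dict[Symbol, float],
) -> Tuple[str, float]:
    """Return the directory entry whose cheapest variant aligns best with hyp."""
    best_name, best_cost = "", float("inf")
    for name, variants in directory.items():
        for variant in variants:
            cost = alignment_cost(hyp, variant, sub_cost, ins_cost, del_cost)
            if cost < best_cost:
                best_name, best_cost = name, cost
    return best_name, best_cost


# Toy directory of spelled names (hypothetical data).
toy_directory = {"MARTIN": [list("MARTIN")], "DURAND": [list("DURAND")]}
# Assume the letters N and M are acoustically confusable, so the substitution is cheap.
toy_sub_cost = {("N", "M"): 0.3}
name, cost = best_entry(list("NARTIN"), toy_directory, toy_sub_cost, {}, {})
```

This exhaustive search compares the hypothesis against every variant of every entry, which is the time-consuming behavior the abstract mentions for large directories; the tree-based pre-selection it proposes is not shown here.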
  • Keywords
    acoustic signal processing; automatic telephone systems; decoding; grammars; hidden Markov models; probability; speech intelligibility; speech recognition; telephone networks; ENST; European project; French; HMM; Hidden Markov modeling; Picasso project; acoustic confusability; acoustic recognition phase; automated speech recognition system; bigram probabilities; database; decoding quality; deletion; directory name retrieval; dynamic alignment; grapheme to phoneme converter; insertion; letters; lexical search; name pronunciation; phone models; phonemes; pre-selection; rescoring strategy; speech units; spelled letters; spelled names; spelling variants; substitution costs; symbol sequences; telematics transaction services; telephone quality speech; training data; tree-based representation; word models; worldwide telephone network; Automatic speech recognition; Costs; Databases; Decoding; Hidden Markov models; Speech recognition; Telematics; Telephony; Testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Interactive Voice Technology for Telecommunications Applications, 1998. IVTTA '98. Proceedings. 1998 IEEE 4th Workshop
  • Conference_Location
    Torino
  • Print_ISBN
    0-7803-5028-6
  • Type

    conf

  • DOI
    10.1109/IVTTA.1998.727689
  • Filename
    727689