  • DocumentCode
    2021087
  • Title
    Predicting unseen triphones with senones
  • Author
    Hwang, Mei-Yuh ; Huang, Xuedong ; Alleva, Fileno
  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    2
  • fYear
    1993
  • fDate
    27-30 April 1993
  • Firstpage
    311
  • Abstract
    In large-vocabulary speech recognition, there are always new triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context-independent monophones. It is proposed that decision-tree-based senones be used to generate needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of each phone, and the leaves of the trees constitute the senone codebook. A Markov state of any triphone traverses the corresponding tree until it reaches a leaf to find the senone it is to be associated with. The DARPA 5000-word speaker-independent Wall Street Journal dictation task is used to evaluate the proposed method. The word error rate is reduced by more than 10% when unseen triphones are modeled by the decision-tree-based senones.
  • Keywords
    Markov processes; dictation; learning (artificial intelligence); speech recognition; trees (mathematics); vocabulary; Markov state; Wall Street Journal dictation; decision-tree-based senones; large-vocabulary speech recognition; senone codebook; training; unseen triphones; word error rate;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
  • Conference_Location
    Minneapolis, MN, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.1993.319299
  • Filename
    319299