• DocumentCode
    2970927
  • Title
    From speech to letters - using a novel neural network architecture for grapheme based ASR
  • Author
    Eyben, Florian ; Wöllmer, Martin ; Schuller, Björn ; Graves, Alex
  • Author_Institution
    Inst. for Human-Machine Commun., Tech. Univ. München, Munich, Germany
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    376
  • Lastpage
    380
  • Abstract
    Mainstream automatic speech recognition systems are based on modelling acoustic sub-word units such as phonemes. Phonemisation dictionaries and language-model-based decoding techniques are applied to transform the phoneme hypotheses into orthographic transcriptions. Direct modelling of graphemes as sub-word units using HMMs has not been successful. We investigate a novel ASR approach using Bidirectional Long Short-Term Memory Recurrent Neural Networks and Connectionist Temporal Classification, which is capable of transcribing graphemes directly and yields results highly competitive with phoneme transcription. In the design of such a grapheme-based speech recognition system, phonemisation dictionaries are no longer required. All that is needed is text transcribed at the sentence level, which greatly simplifies the training procedure. The novel approach is evaluated extensively on the Wall Street Journal 1 corpus.
  • Keywords
    recurrent neural nets; speech coding; speech recognition; bidirectional long short-term memory recurrent neural networks; connectionist temporal classification; decoding techniques; grapheme based speech recognition system phonemisation dictionaries; language model; main-stream automatic speech recognition systems; orthographic transcriptions; Automatic speech recognition; Computer science; Context modeling; Dictionaries; Hidden Markov models; Man machine systems; Natural languages; Neural networks; Recurrent neural networks; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2009.5373257
  • Filename
    5373257