• DocumentCode
    3744912
  • Title

    CRIM and LIUM approaches for multi-genre broadcast media transcription

  • Author

    Vishwa Gupta;Paul Deléglise;Gilles Boulianne;Yannick Estève;Sylvain Meignier;Anthony Rousseau

  • Author_Institution
    Centre de recherche informatique de Montréal (CRIM)
  • fYear
    2015
  • Firstpage
    681
  • Lastpage
    686
  • Abstract
    The Multi-Genre Broadcast Challenge at ASRU 2015 is a controlled evaluation of speech recognition, speaker diarization, and lightly supervised alignment using BBC TV recordings. The CRIM and LIUM teams participated in the speech recognition part of the challenge with a joint submission. This paper presents the contributions of CRIM and LIUM. Each team made different choices in developing its ASR system. The aim was to compare and evaluate different approaches to diarization and acoustic modeling, and to obtain complementary ASR systems for effective merging. CRIM's main contributions are the use of a training scenario similar to multi-lingual training to estimate the deep neural net (DNN) acoustic models with most of the data, the use of a pruned trigram model for search, and the use of a genre-dependent quadgram language model for rescoring the lattice from the search. For LIUM, the focus was on fast decoding with high accuracy. The final word error rates (WER) after merging show that it is possible to get a reasonable WER with automatically aligned files. The final global WER of 25.1% corresponds to a WER reduction of about 20% absolute in comparison to the ASR baseline system provided by the organizers.
  • Keywords
    "Training","Acoustics","Speech","Training data","Data models","Hidden Markov models","Speech recognition"
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
  • Type

    conf

  • DOI
    10.1109/ASRU.2015.7404862
  • Filename
    7404862