• DocumentCode
    730814
  • Title
    Librispeech: An ASR corpus based on public domain audio books
  • Author
    Panayotov, Vassil ; Chen, Guoguo ; Povey, Daniel ; Khudanpur, Sanjeev
  • Author_Institution
    Center for Language & Speech Process., Johns Hopkins Univ., Baltimore, MD, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5206
  • Lastpage
    5210
  • Abstract
    This paper introduces a new corpus of read English speech, suitable for training and evaluating speech recognition systems. The LibriSpeech corpus is derived from audiobooks that are part of the LibriVox project, and contains 1000 hours of speech sampled at 16 kHz. We have made the corpus freely available for download, along with separately prepared language-model training data and pre-built language models. We show that acoustic models trained on LibriSpeech give a lower error rate on the Wall Street Journal (WSJ) test sets than models trained on WSJ itself. We are also releasing Kaldi scripts that make it easy to build these systems.
  • Keywords
    natural language processing; speech recognition; ASR corpus; Kaldi scripts; LibriSpeech corpus; LibriVox project; WSJ; Wall Street Journal; acoustic models; evaluating speech recognition systems; frequency 16 kHz; language-model training data; pre-built language models; public domain audio books; read English speech; training speech recognition systems; Bioinformatics; Blogs; Electronic publishing; Genomics; Information services; Resource description framework; Corpus; LibriVox; Speech Recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178964
  • Filename
    7178964