• DocumentCode
    3486254
  • Title
    Detection of precisely transcribed parts from inexact transcribed corpus
  • Author
    Ohta, Kengo ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi
  • Author_Institution
    Dept. of Inf. & Comput. Sci., Toyohashi Univ. of Technol., Toyohashi, Japan
  • fYear
    2011
  • fDate
    11-15 Dec. 2011
  • Firstpage
    541
  • Lastpage
    546
  • Abstract
    Although large-scale spontaneous speech corpora are a crucial resource for various domains of spoken language processing, they are usually limited by their construction cost, especially the cost of precise transcription. On the other hand, inexact transcribed corpora such as shorthand notes, meeting records and closed captions are widely available. Unfortunately, it is difficult to use them directly as speech corpora for learning acoustic models, because they contain two kinds of text: precisely transcribed parts and edited parts. To resolve this problem, this paper proposes a method for automatically detecting precisely transcribed parts in inexact transcribed corpora. The method consists of two steps: the first step is an automatic alignment between the inexact transcription and its corresponding utterance, and the second step is a support vector machine based detector of precisely transcribed parts that uses several features obtained in the first step. Experiments using the Japanese National Diet Record show that automatic detection of precise parts is effective for lightly supervised speaker adaptation, and that it performs well enough to reduce the cost of converting inexact transcribed corpora into precisely transcribed ones.
  • Keywords
    speaker recognition; speech processing; support vector machines; Japanese National Diet Record; acoustic model; automatic alignment; automatic detection method; inexact transcribed corpus; large scale spontaneous speech corpora; precisely transcribed parts detection; speaker adaptation; spoken language processing; support vector machine based detector; Acoustics; Adaptation models; Hidden Markov models; Predictive models; Speech; Support vector machines; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
  • Conference_Location
    Waikoloa, HI
  • Print_ISBN
    978-1-4673-0365-1
  • Electronic_ISBN
    978-1-4673-0366-8
  • Type
    conf
  • DOI
    10.1109/ASRU.2011.6163989
  • Filename
    6163989