Title :
Detection of precisely transcribed parts from inexact transcribed corpus
Author :
Ohta, Kengo ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi
Author_Institution :
Dept. of Inf. & Comput. Sci., Toyohashi Univ. of Technol., Toyohashi, Japan
Abstract :
Although large-scale spontaneous speech corpora are a crucial resource for various domains of spoken language processing, they are usually limited because of their construction cost, especially the cost of precise transcription. On the other hand, inexactly transcribed corpora such as shorthand notes, meeting records, and closed captions are widely available. Unfortunately, it is difficult to use them directly as speech corpora for training acoustic models, because they contain two kinds of text: precisely transcribed parts and edited parts. To resolve this problem, this paper proposes a method for automatically detecting precisely transcribed parts in inexactly transcribed corpora. The method consists of two steps: the first is an automatic alignment between the inexact transcription and its corresponding utterance, and the second is a support vector machine based detector of precisely transcribed parts that uses several features obtained in the first step. Experiments using the Japanese National Diet Record show that automatic detection of precise parts is effective for lightly supervised speaker adaptation, and that it achieves reasonable performance in reducing the cost of converting inexactly transcribed corpora into precisely transcribed ones.
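The first step described in the abstract (alignment, then feature extraction) can be illustrated with a minimal sketch. The abstract does not specify the alignment algorithm or the features, so everything here is an assumption: the function name `align_features`, the use of a recognizer hypothesis as a stand-in for the utterance, the word-level alignment via Python's standard `difflib.SequenceMatcher`, and the two summary features are hypothetical and are not the authors' method. Features like these could then be passed to a support vector machine classifier, which is not shown.

```python
from difflib import SequenceMatcher


def align_features(transcript_words, hypothesis_words):
    """Align an inexact transcript with a recognizer hypothesis (both word lists).

    Hypothetical illustration only: returns a per-word match flag for the
    transcript plus two simple summary features.
    """
    # autojunk=False disables difflib's popularity heuristic for long sequences.
    matcher = SequenceMatcher(a=transcript_words, b=hypothesis_words, autojunk=False)

    # Mark every transcript word that falls inside a matching block.
    matched = [False] * len(transcript_words)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            matched[i] = True

    # Fraction of transcript words that were aligned to the hypothesis.
    n = len(transcript_words)
    match_rate = sum(matched) / n if n else 0.0

    # Length of the longest run of consecutively matched transcript words.
    longest = run = 0
    for flag in matched:
        run = run + 1 if flag else 0
        longest = max(longest, run)

    return {"matched": matched, "match_rate": match_rate, "longest_run": longest}


if __name__ == "__main__":
    transcript = "we will discuss the budget today".split()
    hypothesis = "uh we will discuss um the budget".split()
    print(align_features(transcript, hypothesis))
```

In this toy input the hypothesis contains fillers that the edited transcript omits, and the transcript ends with a word the hypothesis lacks, so the match flags separate the words supported by the hypothesis from the one that is not.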
Keywords :
speaker recognition; speech processing; support vector machines; Japanese National Diet Record; acoustic model; automatic alignment; automatic detection method; inexact transcribed corpus; large scale spontaneous speech corpora; precisely transcribed parts detection; speaker adaptation; spoken language processing; support vector machine based detector; Acoustics; Adaptation models; Hidden Markov models; Predictive models; Speech; Support vector machines; Training
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163989