Detection of precisely transcribed parts from inexact transcribed corpus

Author

Ohta, Kengo ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi

Author_Institution

Dept. of Inf. & Comput. Sci., Toyohashi Univ. of Technol., Toyohashi, Japan

fYear

2011

fDate

11-15 Dec. 2011

Firstpage

541

Lastpage

546

Abstract

Although large-scale spontaneous speech corpora are a crucial resource for various domains of spoken language processing, they are usually limited because of their construction cost, especially the cost of precise transcription. On the other hand, inexactly transcribed corpora such as shorthand notes, meeting records and closed captions are widely available. Unfortunately, it is difficult to use them directly as speech corpora for training acoustic models, because they contain two kinds of text: precisely transcribed parts and edited parts. To resolve this problem, this paper proposes a method for automatically detecting precisely transcribed parts in inexactly transcribed corpora. The method consists of two steps: the first is an automatic alignment between the inexact transcription and its corresponding utterance, and the second is a support vector machine based detector of precisely transcribed parts that uses several features obtained in the first step. Experiments using the Japanese National Diet Record show that automatic detection of precise parts is effective for lightly supervised speaker adaptation, and that it achieves reasonable performance in reducing the cost of converting inexactly transcribed corpora into precisely transcribed ones.

Keywords

speaker recognition; speech processing; support vector machines; Japanese National Diet Record; acoustic model; automatic alignment; automatic detection method; inexact transcribed corpus; large scale spontaneous speech corpora; precisely transcribed parts detection; speaker adaptation; spoken language processing; support vector machine based detector; Acoustics; Adaptation models; Hidden Markov models; Predictive models; Speech; Support vector machines; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on

Conference_Location

Waikoloa, HI

Print_ISBN

978-1-4673-0365-1

Electronic_ISBN

978-1-4673-0366-8

Type

conf

DOI

10.1109/ASRU.2011.6163989

Filename

6163989