Regional Information Center for Science and Technology - Improved Features and Models for Detecting Edit Disfluencies in Transcribing Spontaneous Mandarin Speech

DocumentCode :

1135644

Title :

Improved Features and Models for Detecting Edit Disfluencies in Transcribing Spontaneous Mandarin Speech

Author :

Lin, Che-kuang ; Lee, Lin-shan

Author_Institution :

Grad. Inst. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan

Volume :

Issue :

fYear :

2009

Firstpage :

1263

Lastpage :

1278

Abstract :

Detection of edit disfluencies is key to transcribing spontaneous utterances. In this paper, we present improved features and models to detect edit disfluencies and enhance transcription of spontaneous Mandarin speech using hypothesized disfluency interruption points (IPs) and edit word detection. A comprehensive set of prosodic features that takes into account the special characteristics of edit disfluencies in Mandarin is developed, and an improved model combining decision trees and maximum entropy is proposed to detect IPs. This model is further adapted to desired prosodic conditions by latent prosodic modeling, a probabilistic framework for analyzing speech prosody in terms of a set of latent prosodic states. These techniques contribute to higher recognition accuracy (by rescoring with the hypothesized IPs) and better edit word detection (using conditional random fields defined on Chinese characters) in the final transcription, as verified by experiments on a spontaneous Mandarin speech corpus.

Keywords :

decision trees; maximum entropy methods; natural language processing; probability; speech recognition; word processing; edit disfluencies; edit word detection; hypothesized disfluency interruption points; latent prosodic modeling; maximum entropy; prosodic features; speech prosody; spontaneous Mandarin speech; spontaneous utterances; transcription; Character recognition; Digital multimedia broadcasting; Entropy; Humans; Information systems; Natural languages; Speech analysis; Speech enhancement; Edit disfluency; interruption point detection; prosody; spontaneous speech

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2009.2014792

Filename :

5165111

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1135644