DocumentCode :
352341
Title :
Use of higher level linguistic structure in acoustic modeling for speech recognition
Author :
Shafran, Izhak ; Ostendorf, Mari
Author_Institution :
Dept. of Electrical Engineering, University of Washington, Seattle, WA, USA
Volume :
2
fYear :
2000
fDate :
2000
Abstract :
Current speech recognition systems perform poorly on conversational speech as compared to read speech, largely because of the additional acoustic variability observed in conversational speech. Our hypothesis is that there are systematic effects, related to higher-level structures, that are not being captured in the current acoustic models. In this paper we describe a method to extend standard clustering to incorporate such features in estimating acoustic models. We report recognition improvements obtained on the Switchboard task over triphones and pentaphones by the use of word- and syllable-level features. In addition, we report preliminary studies on clustering with prosodic information.
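Note: the abstract refers to extending standard decision-tree state clustering with higher-level linguistic features. As a rough illustrative sketch only (not the authors' implementation), the following Python fragment shows how a likelihood-gain splitting criterion over context-dependent states can be asked questions about syllable position and word boundaries alongside ordinary phone-context questions. All statistics, feature names, and question definitions below are hypothetical placeholders.

```python
import math

# Each context-dependent state is summarized by its occupancy count and the
# sufficient statistics of its (1-D, for simplicity) acoustic observations.
# Hypothetical feature keys: 'left', 'right' (phone context), 'syl_pos'
# (onset/coda), 'word_bound' (word-boundary flag).
STATES = [
    {"left": "s",  "right": "ih", "syl_pos": "onset", "word_bound": 1, "count": 40, "sum": 80.0,  "sumsq": 240.0},
    {"left": "t",  "right": "ih", "syl_pos": "onset", "word_bound": 0, "count": 55, "sum": 30.0,  "sumsq": 300.0},
    {"left": "ax", "right": "n",  "syl_pos": "coda",  "word_bound": 1, "count": 35, "sum": -20.0, "sumsq": 150.0},
    {"left": "ax", "right": "m",  "syl_pos": "coda",  "word_bound": 0, "count": 25, "sum": -15.0, "sumsq": 90.0},
]

# Question set: standard phone-context questions plus higher-level ones.
QUESTIONS = [
    ("left context is fricative", lambda s: s["left"] in {"s", "f", "sh"}),
    ("right context is nasal",    lambda s: s["right"] in {"n", "m", "ng"}),
    ("state is in syllable onset", lambda s: s["syl_pos"] == "onset"),
    ("state is at a word boundary", lambda s: s["word_bound"] == 1),
]

def log_likelihood(states):
    """Single-Gaussian log-likelihood of the pooled statistics (up to a constant)."""
    n = sum(s["count"] for s in states)
    if n == 0:
        return 0.0
    mean = sum(s["sum"] for s in states) / n
    var = max(sum(s["sumsq"] for s in states) / n - mean * mean, 1e-4)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_split(states):
    """Pick the question that maximizes the likelihood gain of splitting the cluster."""
    base = log_likelihood(states)
    best = (None, 0.0, None, None)
    for name, q in QUESTIONS:
        yes = [s for s in states if q(s)]
        no = [s for s in states if not q(s)]
        if not yes or not no:
            continue
        gain = log_likelihood(yes) + log_likelihood(no) - base
        if gain > best[1]:
            best = (name, gain, yes, no)
    return best

if __name__ == "__main__":
    name, gain, yes, no = best_split(STATES)
    print(f"best question: {name!r}, likelihood gain: {gain:.2f}")
```

In this kind of scheme, higher-level questions compete with phonetic-context questions on equal footing, so a syllable- or word-level split is chosen only where it explains more acoustic variability than the phone context alone.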
Keywords :
linguistics; modelling; speech recognition; Switchboard task; acoustic modeling; acoustic variability; conversational speech; higher level linguistic structure; higher level structures; pentaphones; prosodic information; read speech; speech recognition; standard clustering; syllable-level features; triphones; word-level features; Automatic speech recognition; Broadcasting; Context modeling; Error analysis; Explosions; Labeling; Loudspeakers; Performance gain; Speech recognition; Testing;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), Proceedings
Conference_Location :
Istanbul
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.859136
Filename :
859136