Regional Information Center for Science and Technology - Attribute based lattice rescoring in spontaneous speech recognition

DocumentCode :

178740

Title :

Attribute based lattice rescoring in spontaneous speech recognition

Author :

I-Fan Chen ; Siniscalchi, Sabato Marco ; Chin-Hui Lee

Author_Institution :

Sch. of ECE, Georgia Inst. of Technol., Atlanta, GA, USA

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

3325

Lastpage :

3329

Abstract :

In this paper we extend attribute-based lattice rescoring to spontaneous speech recognition. This technique is based on two key features: (i) an attribute-based frontend, which consists of a bank of speech attribute detectors followed by an evidence merger that generates confidence scores (e.g., sub-word posterior probabilities), and (ii) a rescoring module that integrates information generated by the frontend into an existing ASR engine through lattice rescoring. The speech attributes used in this work are phonetic features, such as frication and palatalization. Experimental results on the Switchboard part of the NIST 2000 Hub5 data set demonstrate that the proposed approach outperforms LVCSR systems based on Gaussian mixture model/hidden Markov model (GMM/HMM) that do not use attribute-related information. Furthermore, a small yet promising improvement is also observed when rescoring word lattices generated by a state-of-the-art ASR system using deep neural networks. Different frontend configurations are investigated and tested.
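
The rescoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Arc` fields, the log-linear combination with a single `weight`, the topological node numbering, and the toy lattice are all assumptions made for illustration. Each arc carries a baseline score from the ASR engine and a hypothetical confidence score from the attribute frontend; the best path is recomputed after the two are combined.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Arc:
    start: int        # source node id (assumption: nodes are topologically numbered)
    end: int          # destination node id, always greater than start
    word: str
    base_logp: float  # log score from the baseline ASR engine
    attr_logp: float  # hypothetical log confidence from the attribute frontend


def rescore_best_path(arcs: List[Arc], final: int, weight: float) -> Tuple[float, List[str]]:
    """Return (score, words) of the best path from node 0 to `final`
    after adding weight * attr_logp to each arc's baseline score."""
    best: Dict[int, Tuple[float, List[str]]] = {0: (0.0, [])}
    # Sorting by start node processes arcs in topological order,
    # given the node numbering assumption above.
    for arc in sorted(arcs, key=lambda a: a.start):
        if arc.start not in best:
            continue  # node not reachable from the start node
        score, words = best[arc.start]
        cand = score + arc.base_logp + weight * arc.attr_logp
        if arc.end not in best or cand > best[arc.end][0]:
            best[arc.end] = (cand, words + [arc.word])
    return best[final]


# Toy lattice with two competing words on the second arc (made-up scores).
TOY_LATTICE = [
    Arc(0, 1, "I", -1.0, -0.1),
    Arc(1, 2, "scream", -2.0, -3.0),
    Arc(1, 2, "see", -2.5, -0.2),
]

baseline_words = rescore_best_path(TOY_LATTICE, final=2, weight=0.0)[1]
rescored_words = rescore_best_path(TOY_LATTICE, final=2, weight=1.0)[1]
```

With `weight=0.0` the sketch simply returns the baseline best path; with a positive weight the frontend scores can change which hypothesis wins. In practice the weight would be tuned on held-out data.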

Keywords :

feature extraction; neural nets; probability; speech recognition; ASR engine; GMM; Gaussian mixture model; HMM; LVCSR system; NIST 2000 Hub5 data set; Switchboard; attribute based lattice rescoring; attribute related information; attribute-based frontend; confidence scores; deep neural network; evidence merger; frication; frontend configuration; hidden Markov model; information integration; palatalization; phonetic features; rescoring module; speech attribute detectors; speech attributes; spontaneous speech recognition; subword posterior probabilities; Acoustics; Corporate acquisitions; Detectors; Hidden Markov models; Lattices; Speech; Speech recognition; Artificial Neural Networks; Automatic Speech Recognition; Lattice Rescoring; Phonetic Features;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854216

Filename :

6854216

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=178740