Regional Information Center for Science and Technology (RICeST) - Audio-Visual Automatic Speech Recognition for Connected Digits

DocumentCode :

2269023

Title :

Audio-Visual Automatic Speech Recognition for Connected Digits

Author :

Wang, Xiaoping ; Hao, Yufeng ; Fu, Degang ; Yuan, Chunwei

Author_Institution :

State Key Lab. of Bioelectronics, Southeast Univ., Nanjing

Volume :

fYear :

2008

fDate :

20-22 Dec. 2008

Firstpage :

328

Lastpage :

332

Abstract :

Audio-visual automatic speech recognition (ASR) is an active research topic in the field of human-computer interaction (HCI). This paper implemented an audio-visual ASR system for Chinese connected digits and focused on the method of speech segmentation. A novel speech segmentation approach was proposed that combines Otsu's method with the traditional method based on short-time energy and zero-crossing rate (ZCR). The experimental results showed its efficiency compared with the traditional method. Discrete cosine transform (DCT) coefficients and Mel frequency cepstral coefficients (MFCC) were then used as the visual and audio features, respectively. After the speaker-independent recognition tasks were carried out, the performance of audio-visual ASR and audio-only ASR under different noise conditions was compared.
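The abstract names the ingredients of the segmentation step (Otsu's method, short-time energy, ZCR) but this record does not say how the paper combines them. The following is a minimal sketch of one plausible combination, not the authors' algorithm: Otsu's method picks the energy threshold automatically, and a fixed ZCR limit extends each detected segment over adjacent low-energy, high-ZCR frames such as unvoiced consonants. The function names, frame sizes, bin count, and `zcr_limit` are all hypothetical choices made for illustration.

```python
def frame_features(signal, frame_len=200, hop=100):
    """Return (short-time energy, zero-crossing rate) lists, one value per frame."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
        # Count sign changes between consecutive samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs


def otsu_threshold(values, bins=64):
    """Otsu's method: choose the threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_var, best_k = -1.0, 0
    w0, sum0 = 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]
        sum0 += centers[k] * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_k = var_between, k
    # The threshold is the upper edge of the last bin assigned to the low class.
    return lo + (best_k + 1) * width


def segment_speech(signal, frame_len=200, hop=100, zcr_limit=0.3):
    """Return (start_sample, end_sample) pairs for detected speech segments.

    Assumed combination (hypothetical): Otsu threshold on energy, then
    ZCR-based extension of each segment's boundaries.
    """
    energies, zcrs = frame_features(signal, frame_len, hop)
    threshold = otsu_threshold(energies)
    is_speech = [e > threshold for e in energies]
    segments = []
    n = len(is_speech)
    i, prev_end = 0, -1
    while i < n:
        if not is_speech[i]:
            i += 1
            continue
        start = i
        while i + 1 < n and is_speech[i + 1]:
            i += 1
        end = i
        # Extend backward and forward over high-ZCR frames next to the segment.
        while start > prev_end + 1 and zcrs[start - 1] > zcr_limit:
            start -= 1
        while end + 1 < n and not is_speech[end + 1] and zcrs[end + 1] > zcr_limit:
            end += 1
        segments.append((start * hop, end * hop + frame_len))
        prev_end = end
        i = end + 1
    return segments
```

Calling `segment_speech(samples)` on a list of float samples returns sample-index ranges. Passing a `zcr_limit` above 1.0 disables the ZCR extension, which gives an energy-only baseline for comparison.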

Keywords :

discrete cosine transforms; human computer interaction; speech recognition; Chinese connected digits; Mel frequency cepstral coefficients; audio-only ASR; audio-visual automatic speech recognition; discrete cosine transform coefficients; human-computer interaction; short-time energy-based method; speech segmentation; zero-crossing rate-based method; Automatic speech recognition; Discrete cosine transforms; Feature extraction; Flowcharts; Hidden Markov models; Human computer interaction; Image segmentation; Mel frequency cepstral coefficient; Neural networks; Skin; Otsu's method; audio-visual automatic speech recognition; endpoint detection; speech segmentation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Information Technology Application, 2008. IITA '08. Second International Symposium on

Conference_Location :

Shanghai

Print_ISBN :

978-0-7695-3497-8

Type :

conf

DOI :

10.1109/IITA.2008.82

Filename :

4740012

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2269023