Regional Information Center for Science and Technology (RICEST)

DocumentCode :

66211

Title :

A Direct Masking Approach to Robust ASR

Author :

Hartmann, W. ; Narayanan, Arun ; Fosler-Lussier, Eric ; DeLiang Wang

Author_Institution :

Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA

Volume :

Issue :

fYear :

2013

fDate :

Oct. 2013

Firstpage :

1993

Lastpage :

2005

Abstract :

Recently, much work has been devoted to the computation of binary masks for speech segregation. Conventional wisdom in the field of ASR holds that these binary masks cannot be used directly; the missing energy significantly affects the calculation of the cepstral features commonly used in ASR. We show that this commonly held belief may be a misconception; we demonstrate the effectiveness of directly using the masked data on both a small and a large vocabulary dataset. In fact, this approach, which we term the direct masking approach, performs comparably to two previously proposed missing feature techniques. We also investigate the reasons why other researchers may not have come to this conclusion; variance normalization of the features is a significant factor in performance. This work suggests a much better baseline than unenhanced speech for future work in missing feature ASR.

Keywords :

speech synthesis; binary masks; cepstral features; direct masking approach; masked data; missing feature techniques; robust ASR; speech segregation; unenhanced speech; variance normalization; direct masking; ideal binary mask; robust automatic speech recognition

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2013.2263802

Filename :

6517211

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=66211