Filtering on the temporal probability sequence in histogram equalization for robust speech recognition

Author

Syu-Siang Wang ; Yu Tsao ; Jeih-weih Hung

Author_Institution

Res. Center for Inf. Technol. Innovation, Taipei, Taiwan

fYear

2013

Firstpage

7112

Lastpage

7116

Abstract

In this paper, we propose a filter-based histogram equalization (FHEQ) approach for robust speech recognition. The FHEQ approach first represents the original acoustic feature sequence as a sequence of statistical probabilities. Then, a temporal average (TA) filter is applied to smooth the statistical probability sequence. Finally, the filtered statistical probability sequence is transformed to form a new acoustic feature stream. Filtering the statistical probability of a feature sequence is a novel concept that combines the advantages of conventional histogram equalization (HEQ) and temporal filtering techniques for better noise robustness. Our experimental results on the Aurora-2 and Aurora-4 tasks show that FHEQ outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and HEQ. Furthermore, we conducted a comparison test on TA-HEQ and HEQ-TA, which apply a TA filter to smooth the acoustic features before and after HEQ processing, respectively. The test results show that FHEQ outperforms both TA-HEQ and HEQ-TA, suggesting that filtering in the probability domain is more effective than filtering in the acoustic feature domain.
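
The three steps named in the abstract (features to probabilities, temporal averaging, probabilities back to features) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not give these details: the rank-based per-utterance probability estimate, the standard normal reference distribution, the edge padding, and the names `empirical_cdf`, `temporal_average`, `fheq_sketch`, and `half_width` are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def empirical_cdf(x):
    """Rank-based probability estimate for one feature dimension (assumption)."""
    n = len(x)
    ranks = np.argsort(np.argsort(x))  # rank 0..n-1 of each frame
    return (ranks + 0.5) / n           # values stay strictly inside (0, 1)


def temporal_average(p, half_width=1):
    """Moving average over 2*half_width+1 frames; edges are padded by repetition."""
    k = 2 * half_width + 1
    padded = np.pad(p, half_width, mode="edge")
    return np.convolve(padded, np.ones(k) / k, mode="valid")


def fheq_sketch(features, half_width=1):
    """Apply the three steps to a (frames, dims) array, one dimension at a time."""
    # Inverse CDF of a standard normal reference distribution (assumption).
    inv_cdf = np.vectorize(NormalDist().inv_cdf, otypes=[float])
    out = np.empty(features.shape, dtype=float)
    for d in range(features.shape[1]):
        p = empirical_cdf(features[:, d])             # step 1: probabilities
        p_smooth = temporal_average(p, half_width)    # step 2: TA filter
        out[:, d] = inv_cdf(p_smooth)                 # step 3: new features
    return out
```

In this sketch, `half_width=0` disables the smoothing and reduces the function to a plain rank-based HEQ, which gives a simple baseline for comparison. Applying `temporal_average` to the features before or after that baseline would correspond, loosely, to the TA-HEQ and HEQ-TA orderings mentioned in the abstract.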

Keywords

acoustic signal processing; smoothing methods; speech recognition; statistics; Aurora-2 tasks; Aurora-4 tasks; FHEQ approach; HEQ-TA; TA filter; TA-HEQ; acoustic feature sequence representation; acoustic feature stream; filter-based histogram equalization approach; noise robustness; robust speech recognition; statistic probability sequence smoothing; temporal average filter; temporal probability sequence filtering; Acoustics; Histograms; Noise; Robustness; Speech; Speech recognition; Training; FHEQ; HEQ; feature normalization; noise robust speech recognition; temporal filter

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6639042

Filename

6639042