Title :
Real-Time Speaker Adaptation for Speech Recognition on Mobile Devices
Author_Institution :
Comput. Sci. Lab., Samsung Adv. Inst. of Technol., Samsung Electron., Yongin, South Korea
Abstract :
This paper introduces a real-time speaker adaptation method for speech recognition on mobile devices. To adapt the speech recognition system to arbitrary speakers, we employ vocal tract length normalization (VTLN). In conventional VTLN, the warping factor is chosen by maximum likelihood estimation: every candidate warping factor is applied during recognition, and the one that best fits the speaker or utterance is selected. Although this performs well, its computational cost makes it inefficient on mobile devices. To reduce the computational effort, we employ pitch-based VTLN and simplify the pitch estimation. The proposed method achieves a relative word error rate reduction of 21.5% in Korean, while running 10.5% slower than the baseline.
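The contrast drawn in the abstract between an exhaustive maximum likelihood search and a pitch-based choice of warping factor can be illustrated with a minimal Python sketch. It is not the paper's implementation. The piecewise-linear warp is one common VTLN formulation; the candidate range 0.88 to 1.12, the 0.8 cutoff ratio, the `score_fn` callback, and especially the linear pitch-to-factor mapping in `pitch_to_alpha` are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np

def warp_frequency(freqs, alpha, f_max, cutoff_ratio=0.8):
    """Piecewise-linear frequency warp (a common VTLN form, assumed here).

    Frequencies up to the cutoff are scaled by alpha; above it, a second
    linear segment is used so that f_max still maps to f_max.
    """
    freqs = np.asarray(freqs, dtype=float)
    f0 = cutoff_ratio * f_max
    upper_slope = (f_max - alpha * f0) / (f_max - f0)
    return np.where(freqs <= f0,
                    alpha * freqs,
                    alpha * f0 + upper_slope * (freqs - f0))

def select_warp_ml(score_fn, alphas):
    """Conventional selection: score every candidate and keep the best.

    score_fn is a hypothetical callback returning the log-likelihood of
    the utterance under a given warping factor. Each call stands for a
    full feature-extraction and scoring pass, which is the expensive part.
    """
    scores = [score_fn(a) for a in alphas]
    return float(alphas[int(np.argmax(scores))])

def pitch_to_alpha(pitch_hz, pitch_range=(80.0, 300.0),
                   alpha_range=(0.88, 1.12)):
    """Pitch-based selection: one lookup instead of a search.

    The linear mapping and both ranges are illustrative assumptions,
    not values taken from the paper.
    """
    return float(np.interp(pitch_hz, pitch_range, alpha_range))

# Candidate grid for the exhaustive search (assumed range and step).
ALPHAS = np.linspace(0.88, 1.12, 13)
```

With 13 candidates, `select_warp_ml` needs 13 scoring passes per utterance, whereas `pitch_to_alpha` needs a single pitch estimate, which is the kind of saving the abstract is after.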
Keywords :
maximum likelihood estimation; mobile handsets; speech recognition; VTLN; mobile devices; pitch estimation; real-time speaker adaptation method; vocal tract length normalization; warping factors; word error rate reduction; Adaptation model; Communications Society; Computer science; Filter bank; Gas insulated transmission lines; Maximum likelihood linear regression; Mobile computing; Paper technology; Resonant frequency
Conference_Title :
Consumer Communications and Networking Conference (CCNC), 2010 7th IEEE
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-5175-3
Electronic_ISBN :
978-1-4244-5176-0
DOI :
10.1109/CCNC.2010.5421765