DocumentCode :
2788233
Title :
Vocabulary and language model adaptation using just one speech file
Author :
Meng, S. ; Thambiratnam, K. ; Lin, Y. ; Wang, L. ; Li, G. ; Seide, F.
Author_Institution :
5F Beijing Sigma Center, Microsoft Res. Asia, Beijing, China
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5410
Lastpage :
5413
Abstract :
This paper investigates unsupervised vocabulary and language model self-adaptation (VLA) from just one speech file, using the web as a knowledge source and without prior knowledge of topic or domain beyond optional file metadata. Single-file self-adaptation is regularly used for acoustic adaptation but, to date, has rarely been used for VLA. The method investigated here uses a first-pass transcript or file metadata to generate web search queries for retrieving texts for adaptation. Various strategies for building queries, retrieving web texts, and maximizing out-of-vocabulary (OOV) recovery while constraining vocabulary growth are examined. Significant improvements are demonstrated for transcribing and searching recorded lectures and telephone calls. The proposed method is orthogonal to acoustic adaptation and system combination and integrates well into multi-pass recognition architectures.
Keywords :
speech recognition; unsupervised learning; vocabulary; acoustic adaptation; knowledge source; multi pass recognition architecture; optional file metadata; out of vocabulary recovery; searching recorded lecture; single file self adaptation; speech file; telephone call; transcribing recorded lecture; vocabulary and language model self adaptation; web search query; Acoustical engineering; Adaptation model; Asia; Humans; Knowledge engineering; Natural languages; Search engines; Speech recognition; Vocabulary; Web search; Language Model Adaptation; Out-Of-Vocabulary (OOV); Spoken Document Retrieval; Unsupervised Adaptation; Vocabulary Adaptation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5494929
Filename :
5494929