DocumentCode
2702921
Title
Voice-Melody Transcription Under a Speech Recognition Framework
Author
Dan-ning Jiang ; Michael Picheny ; Yong Qin
Author_Institution
IBM China Research Lab, China
Volume
4
fYear
2007
fDate
15-20 April 2007
Abstract
This paper presents a robust voice-melody transcription system built on a speech recognition framework. While many previous voice-melody transcription systems have used non-statistical approaches, statistical recognition technology can potentially achieve more robust results. A cepstrum-based acoustic model is employed to avoid the hard decisions required by explicit voiced-unvoiced segmentation and pitch extraction, and a key-independent 4-gram language model is employed to capture the prior probabilities of different melodic sequences. The system is evaluated on both note recognition error rate and end-to-end query-by-humming performance, and the results are compared with three other voice-melody transcription systems. Experiments show that the system is state of the art: it is much more robust than the other systems on data containing noise, and close to the best of them on the clean data set.
Keywords
acoustic signal processing; speech processing; speech recognition; statistics; cepstrum-based acoustic model; key-independent 4-gram language model; melodic sequences; pitch extraction; query-by-humming end-to-end performance; recognition error rate; speech recognition framework; statistical recognition technology; voice-melody transcription system; voiced-unvoiced segmentation; Cepstral analysis; Data mining; Databases; Error analysis; Hidden Markov models; Humans; Music information retrieval; Noise robustness; Probability; Speech recognition; Query-by-Humming; voice-melody transcription;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location
Honolulu, HI
ISSN
1520-6149
Print_ISBN
1-4244-0727-3
Type
conf
DOI
10.1109/ICASSP.2007.366988
Filename
4218176