Title :
Singing voice recognition considering high-pitched and prolonged sounds
Author_Institution :
Nat. Inst. of Adv. Ind. Sci. & Technol. (AIST), Tsukuba, Japan
Abstract :
A conventional Large Vocabulary Continuous Speech Recognition (LVCSR) system has difficulty recognizing singing voices because their high-pitched and prolonged sounds tend to degrade recognition accuracy. We previously described an Auto-Regressive Hidden Markov Model (AR-HMM) and an accompanying parameter estimation method. We demonstrated that the AR-HMM accurately estimated the characteristics of both articulatory systems and excitation signals from high-pitched speech. In this paper, we describe an AR-HMM applied to feature extraction from singing voices and propose a prolonged-sound detection and elimination method.
Keywords :
feature extraction; hidden Markov models; parameter estimation; speech recognition; AR-HMM; LVCSR system; articulatory system; autoregressive hidden Markov model; elimination method; excitation signal; large vocabulary continuous speech recognition system; parameter estimation method; prolonged-sound detection; singing voice recognition; Abstracts; Mel frequency cepstral coefficient; Single photon emission computed tomography; Speech
Conference_Title :
14th European Signal Processing Conference, 2006
Conference_Location :
Florence