Title :
On the Projection of PLLRs for Unbounded Feature Distributions in Spoken Language Recognition
Author :
Diez, Mireia ; Varona, Amparo ; Penagarikano, Mike ; Rodriguez-Fuentes, Luis Javier ; Bordel, German
Author_Institution :
Dept. of Electr. & Electron., Univ. of the Basque Country, Leioa, Spain
Abstract :
The so-called Phone Log-Likelihood Ratio (PLLR) features were recently introduced as a novel and effective way of retrieving acoustic-phonetic information in spoken language and speaker recognition systems. This letter examines the PLLR feature space in depth and analyzes the multidimensional distribution of these features in a language recognition system. The study reveals that PLLR features are confined to a subspace that strongly bounds their distributions. To enhance the information retrieved by the system, PLLR features are projected onto a hyperplane that provides a more suitable representation of the subspace where the features lie. After the projection, PCA is used to decorrelate the features. The gains attained at each step of the proposed approach are reported and compared to those of a simple PCA projection. Experiments carried out on the NIST 2007, 2009 and 2011 LRE datasets demonstrate the effectiveness of the proposed method, which yields up to a 27% relative improvement over the system based on the original features.
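The projection step described in the abstract can be illustrated with a minimal sketch. Two assumptions here are not stated in the abstract: that a PLLR is the log-odds of a phone posterior, log(p / (1 - p)), and that the target hyperplane passes through the origin with the all-ones vector as its normal. The function names `pllr` and `project_onto_hyperplane` are hypothetical, and the subsequent PCA decorrelation is omitted.

```python
import math


def pllr(posteriors):
    """Assumed PLLR definition: log-odds of each phone posterior."""
    return [math.log(p / (1.0 - p)) for p in posteriors]


def project_onto_hyperplane(x):
    """Orthogonal projection onto the hyperplane whose normal is the
    all-ones vector (an assumed choice of hyperplane).

    For this normal, the projection reduces to subtracting the mean of
    the components from every component.
    """
    mean = sum(x) / len(x)
    return [v - mean for v in x]


# Toy frame with three phone posteriors that sum to one.
posteriors = [0.7, 0.2, 0.1]
features = pllr(posteriors)
projected = project_onto_hyperplane(features)
```

Under these assumptions the projected components sum to zero, and applying the projection a second time leaves the vector unchanged, as expected of an orthogonal projection.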
Keywords :
feature extraction; maximum likelihood estimation; multidimensional systems; speaker recognition; NIST LRE datasets; PLLR projection; acoustic-phonetic information; language recognition system; multidimensional distribution; phone log-likelihood ratio; spoken language recognition; unbounded feature distributions; Decoding; Mel frequency cepstral coefficient; NIST; Principal component analysis; Vectors; Feature projection; i-vectors; phone log-likelihood ratios
Journal_Title :
Signal Processing Letters, IEEE
DOI :
10.1109/LSP.2014.2324819