• DocumentCode
    179524
  • Title

    Extended phone log-likelihood ratio features and acoustic-based i-vectors for language recognition

  • Author

    D'Haro, Luis Fernando ; Cordoba, Ricardo ; Salamea, C. ; Echeverry, J.D.

  • Author_Institution
    E.T.S.I. Telecomun., Dept. de Ing. Electron., Univ. Politec. de Madrid, Madrid, Spain
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    5342
  • Lastpage
    5346
  • Abstract
    This paper presents new techniques that yield relevant improvements over the primary system our group submitted to the Albayzin 2012 LRE competition, where the use of any additional corpora for training or optimizing the models was forbidden. We incorporate an additional phonotactic subsystem based on phone log-likelihood ratio (PLLR) features extracted from different phonotactic recognizers, which improves the accuracy of the system by 21.4% in terms of Cavg (we also present results for Fact, the official metric during the evaluation). We show that using these features at the phone state level provides significant improvements when combined with dimensionality reduction techniques, especially PCA. We have also experimented with alternative SDC-like configurations applied to these PLLR features, obtaining additional improvements. Finally, we describe some modifications to the MFCC-based acoustic i-vector system that have also contributed to additional improvements. The final fused system outperformed the baseline by 27.4% in Cavg.
  • Keywords
    acoustic signal processing; feature extraction; natural language processing; principal component analysis; speaker recognition; Albayzin 2012 LRE competition; MFCC-based acoustic i-vector system; PCA; PLLR; SDC-like configurations; dimensionality reduction techniques; language recognition; phone log-likelihood ratio features; phone state level; phonotactic recognizer; phonotactic subsystem; speaker recognition systems; Mel frequency cepstral coefficient; Principal component analysis; Speaker recognition; Speech; Speech processing; Speech recognition; Phone Log-Likelihood Ratios; SDC; dimensionality reduction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type

    conf

  • DOI
    10.1109/ICASSP.2014.6854623
  • Filename
    6854623