• DocumentCode
    1448126
  • Title

    Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal

  • Author

    Dobry, Gil ; Hecht, Ron M. ; Avigal, Mireille ; Zigel, Yaniv

  • Author_Institution
    Open Univ. of Israel, Ra'anana, Israel
  • Volume
    19
  • Issue
    7
  • fYear
    2011
  • Firstpage
    1975
  • Lastpage
    1985
  • Abstract
    This paper presents a novel dimension reduction method that aims to improve both the accuracy and the efficiency of speaker age estimation systems based on the speech signal. Two age estimation approaches were studied and implemented: age-group classification and precise age estimation using regression. Both approaches use Gaussian mixture model (GMM) supervectors as features for a support vector machine (SVM) model. Using a radial basis function (RBF) kernel improves accuracy compared to a linear kernel; however, its computational complexity is more sensitive to the feature dimension. Classic dimension reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) tend to eliminate relevant feature information and cannot always be applied without damaging the model's accuracy. In our study, a novel dimension reduction method was developed: weighted-pairwise principal components analysis (WPPCA), based on the nuisance attribute projection (NAP) technique. This method projects the supervectors to a reduced space where the redundant within-class pairwise variability is eliminated. The method was compared to a baseline system in which no dimensionality reduction is applied to the supervectors. The experiments showed a dramatic speed-up in SVM training and testing time when using the reduced feature vectors. With the proposed dimension reduction method, system accuracy improved by 5% for the classification system and by 10% for the regression system.
  • Keywords
    Gaussian processes; acoustic signal processing; principal component analysis; radial basis function networks; regression analysis; speech processing; support vector machines; GMM; Gaussian mixture model; NAP technique; SVM model; SVM training testing; acoustic speech signal; age-group classification; class pairwise variability; classic dimension reduction methods; computational complexity; linear discriminant analysis; linear kernel; nuisance attribute projection technique; principal component analysis; radial basis function kernel; regression analysis; speaker age estimation system; supervector dimension reduction method; support vector machine model; weighted-pairwise principal components analysis; Adaptation model; Estimation; Principal component analysis; Support vector machine classification; Testing; Training; Age estimation; Gaussian mixture model (GMM) supervector; dimension reduction; support vector machine (SVM); support vector regression;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2011.2104955
  • Filename
    5711646
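
The abstract describes projecting supervectors into a space where within-class pairwise variability is eliminated. The following is a minimal sketch of a generic, unweighted NAP-style projection, not the paper's WPPCA method, whose weighting scheme is not given in this record. The function names, the `n_nuisance` parameter, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def within_class_pairwise_scatter(X, labels):
    """Sum of (xi - xj)(xi - xj)^T over all same-class pairs i < j.

    Uses the identity: for a class with n samples and mean mu,
    sum_{i<j} (xi - xj)(xi - xj)^T = n * sum_i (xi - mu)(xi - mu)^T.
    Assumption: all pairs are weighted equally (no WPPCA-style weights).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    d = X.shape[1]
    S = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        centered = Xc - Xc.mean(axis=0)
        S += len(Xc) * (centered.T @ centered)
    return S


def nap_projection(X, labels, n_nuisance):
    """Return P = I - U U^T, where the columns of U are the n_nuisance
    directions with the largest within-class pairwise variability."""
    S = within_class_pairwise_scatter(X, labels)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(S)
    U = eigvecs[:, -n_nuisance:]
    return np.eye(S.shape[0]) - U @ U.T


# Toy data (hypothetical): classes differ along axis 1,
# while the within-class spread lies along axis 0.
X_toy = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-2.0, 0.0, 0.0],
                  [0.0, 5.0, 0.0], [3.0, 5.0, 0.0], [-3.0, 5.0, 0.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
P_toy = nap_projection(X_toy, y_toy, n_nuisance=1)
X_clean = X_toy @ P_toy  # axis-0 spread removed, class separation kept
```

In this sketch the projected vectors keep their original dimension. To obtain reduced feature vectors for the SVM, one could instead project onto the eigenvectors that were not removed, which is again an assumption rather than the published procedure.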