• DocumentCode
    178032
  • Title
    Training Pairwise Support Vector Machines with large scale datasets
  • Author
    Cumani, Sandro ; Laface, Pietro
  • Author_Institution
    Politec. di Torino, Turin, Italy
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    1645
  • Lastpage
    1649
  • Abstract
    We recently presented an efficient approach for training a Pairwise Support Vector Machine (PSVM) with a suitable kernel for a fairly large speaker recognition task. Rather than estimating one SVM model per class according to the “one versus all” discriminative paradigm, the PSVM approach classifies pairs of examples as belonging or not belonging to the same class. Training a PSVM with a large amount of data, however, is expensive in both memory and computation, because the number of training pairs grows quadratically with the number of training patterns. This paper proposes an approach that allows discarding the training pairs that do not essentially contribute to the set of Support Vectors (SVs) of the training set. This selection of training pairs is feasible because we show that the number of SVs does not grow quadratically, as the number of pairs does, but only linearly with the number of speakers in the training set. Our approach dramatically reduces the memory and computational complexity of PSVM training, making it possible to use large datasets that include many speakers. It has been assessed on the extended core conditions of the 2012 Speaker Recognition Evaluation. The results show that the accuracy of the trained PSVMs increases with the training set size, and that the Cprimary of a PSVM trained with a small subset of the i-vector pairs is 10-30% better than that obtained by a generative model trained on the complete set of i-vectors.
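    The quadratic growth mentioned in the abstract can be illustrated with a minimal sketch. It only counts candidate pairs and is not the authors' pair-selection method; the function name, the toy speaker labels, and the choice to count unordered pairs are assumptions made for illustration.

```python
from collections import Counter


def count_pairs(speaker_labels):
    """Count unordered pairs of training patterns.

    Returns (total, same_speaker, different_speaker).
    Assumption: each pair is counted once, without regard to order.
    """
    n = len(speaker_labels)
    total = n * (n - 1) // 2
    # Same-speaker pairs: choose 2 among the patterns of each speaker.
    same = sum(c * (c - 1) // 2 for c in Counter(speaker_labels).values())
    return total, same, total - same


# Hypothetical toy data: 3 speakers with 4 patterns each (12 patterns).
labels = [spk for spk in ("spk_a", "spk_b", "spk_c") for _ in range(4)]
total, same, different = count_pairs(labels)
print(total, same, different)
```

    With these 12 toy patterns there are 66 candidate pairs; doubling the data to 24 patterns gives 276, roughly four times as many, which is why discarding uninformative pairs matters at scale.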
  • Keywords
    computational complexity; speaker recognition; support vector machines; vectors; PSVM training; computational complexity; computational expensive task; discriminative paradigm; generative model; i-vectors pairs; large scale datasets; memory task; pairwise support vector machine training; speaker recognition task; suitable kernel; training pairs; training patterns; training set size; Computational modeling; Kernel; NIST; Speaker recognition; Support vector machines; Training; Vectors; PLDA; Pairwise Support Vector Machines; Speaker recognition; Support Vectors; i-vector;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853877
  • Filename
    6853877