• DocumentCode
    1690155
  • Title
    Supervector pre-processing for PRSVM-based Chinese and Arabic dialect identification
  • Author
    Zhang, Qian; Boril, Hynek; Hansen, John H. L.
  • Author_Institution
    Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
  • fYear
    2013
  • Firstpage
    7363
  • Lastpage
    7367
  • Abstract
    Phonotactic modeling has become a widely used means for speaker, language, and dialect recognition. This paper explores variations to supervector pre-processing for phone recognition-support vector machines (PRSVM) based dialect identification. The aspects studied are: (i) normalization of supervector dimensions in the pre-squashing stage, (ii) impact of alternative squashing functions, and (iii) N-gram selection for supervector dimensionality reduction. In (i) and (ii), we find that several alternatives to commonly used approaches can provide moderate, yet consistent performance improvements. In (iii), a newly proposed dialect salience measure is applied in supervector dimension selection and compared to a common N-gram frequency based selection. The results show a strong correlation between dialect-salience and frequency of occurrence in N-grams. The evaluations in this study are conducted on a corpus of Chinese dialects, a Pan-Arabic corpus, and a set of Arabic CTS corpora.
  • Keywords
    natural language processing; speaker recognition; support vector machines; Arabic CTS corpora; Arabic dialect identification; Chinese dialect identification; N-gram selection; PRSVM; Pan-Arabic corpus; alternative squashing functions; dialect recognition; dialect salience measure; language recognition; phone recognition-support vector machines; phonotactic modeling; pre-squashing stage; supervector dimension normalization; supervector pre-processing; Boolean functions; Data structures; Frequency estimation; Speech; Training; Dialect identification; dialect-salience; squashing function
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639093
  • Filename
    6639093