Language identification using sparse representation: A comparison between GMM supervector and i-vector based approaches

Author

Singh, O.P. ; Haris, B.C. ; Sinha, Roopak

Author_Institution

Dept. of Electron. & Electr. Eng., Indian Inst. of Technol. Guwahati, Guwahati, India

fYear

2013

fDate

13-15 Dec. 2013

Firstpage

1

Lastpage

4

Abstract

In recent times, sparse representation classification (SRC) has received a lot of attention in many signal processing domains, including language identification (LID). Traditionally, the dictionary in SRC is designed to be overcomplete. In SRC-based LID systems that use GMM mean supervectors as the language representation, the resulting dictionary is undercomplete due to lack of data. In contrast, when lower-dimensional i-vectors are used, an overcomplete dictionary can be achieved. In this work we examine the apprehension about whether sparse coding can succeed with an undercomplete dictionary. Experimental studies on the NIST LRE 2007 dataset show that the undercomplete dictionary performs better than the overcomplete dictionary, both with and without channel compensation.
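
For readers unfamiliar with SRC, the following is a minimal Python sketch of the general technique, not the authors' implementation. It assumes a dictionary whose columns are unit-norm training vectors (supervectors or i-vectors in the abstract's setting), a greedy orthogonal matching pursuit for the sparse coding step, and a decision by smallest class-wise reconstruction residual. The names omp, src_classify, and the sparsity level k are hypothetical and chosen for illustration only. A dictionary is undercomplete when it has fewer columns than rows, and overcomplete when it has more.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit (illustrative sketch).

    D: (dim, n_atoms) dictionary, columns assumed unit-norm.
    y: (dim,) test vector.
    k: maximum number of atoms to select (assumed sparsity level, k >= 1).
    Returns a length-n_atoms coefficient vector with at most k non-zeros.
    """
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef[support] = sol
    return coef

def src_classify(D, labels, y, k=5):
    """Assign y to the class whose atoms reconstruct it with least error."""
    x = omp(D, y, k)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

# Toy undercomplete dictionary: 50 dimensions, only 12 atoms, 2 classes.
rng = np.random.default_rng(0)
D = rng.standard_normal((50, 12))
D /= np.linalg.norm(D, axis=0)          # unit-norm columns
labels = np.array([0] * 6 + [1] * 6)    # first six atoms class 0, rest class 1
y = 0.7 * D[:, 6] + 0.3 * D[:, 7]       # synthetic test vector built from class 1
print(src_classify(D, labels, y, k=5))
```

The toy data here is synthetic and says nothing about the reported results. It only shows that the same decision rule applies whether the dictionary is undercomplete or overcomplete.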

Keywords

Gaussian processes; mixture models; natural language processing; signal classification; signal representation; speech coding; vectors; GMM mean supervectors; Gaussian mixture models; NIST LRE 2007 dataset; SRC based LID systems; i-vector based approaches; language identification; language representation; signal processing domains; sparse coding; sparse representation classification; undercomplete dictionary; Covariance matrices; Dictionaries; Matching pursuit algorithms; NIST; Speech; Training; Vectors; language identification; sparse representation; undercomplete dictionary;

fLanguage

English

Publisher

IEEE

Conference_Titel

India Conference (INDICON), 2013 Annual IEEE

Conference_Location

Mumbai

Print_ISBN

978-1-4799-2274-1

Type

conf

DOI

10.1109/INDCON.2013.6726125

Filename

6726125