Pair-wise language discrimination using phonotactic information

Author

Lekshmi M Nair;Leena Mary

Author_Institution

Department of ECE, RIT, Kottayam, India

fYear

2015

Firstpage

544

Lastpage

547

Abstract

This paper describes a novel method for automatic language identification using phonotactics. The conventional phonotactic approach, based on N-gram language modeling, requires several hours of speech data along with the corresponding orthographic transcriptions, which are not available for many Indian languages. This paper proposes a method that captures the language-discriminating cue in the co-occurrence of phones using limited data. Here, a speech utterance is decoded into a sequence of chosen phones using an automatic phone recognizer. A unique code is assigned to each phone to obtain feature vectors corresponding to five consecutive phones. These feature vectors are then used to train a neural network or SVM-based classifier at the back end. A pair-wise language discrimination system for Hindi and Malayalam is developed using manual and automatic transcriptions.
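
The feature-extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how codes are assigned or how windows are advanced, so the integer coding scheme, the one-phone window stride, the example phone sequence, and the function names `build_phone_codes` and `phone_windows` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def build_phone_codes(phones: Sequence[str]) -> Dict[str, int]:
    """Assign each distinct phone a unique integer code.

    Assumption: codes are 1, 2, 3, ... in order of first appearance.
    The abstract only states that each phone gets a unique code.
    """
    codes: Dict[str, int] = {}
    for phone in phones:
        if phone not in codes:
            codes[phone] = len(codes) + 1
    return codes


def phone_windows(
    sequence: Sequence[str], codes: Dict[str, int], width: int = 5
) -> List[List[int]]:
    """Turn a decoded phone sequence into vectors of `width` consecutive codes.

    Assumption: the window slides by one phone at a time. Sequences
    shorter than `width` yield no vectors.
    """
    encoded = [codes[phone] for phone in sequence]
    return [encoded[i:i + width] for i in range(len(encoded) - width + 1)]


# Hypothetical phone recognizer output for one short utterance.
decoded = ["n", "a", "m", "a", "s", "t", "e"]
codes = build_phone_codes(decoded)
vectors = phone_windows(decoded, codes)

for vector in vectors:
    print(vector)
```

Each resulting five-element vector, labeled with the language of its utterance, would then be one training example for the back-end classifier.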

Keywords

"Speech","Speech recognition","Feature extraction","Support vector machines","Engines","Acoustics","Speech processing"

Publisher

IEEE

Conference_Title

Control Communication & Computing India (ICCC), 2015 International Conference on

Type

conf

DOI

10.1109/ICCC.2015.7432957

Filename

7432957

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3760810