Using k-Nearest Neighbor and Speaker Ranking for Phoneme Prediction

Author

Rizwan, Muhammad ; Anderson, David V.

Author_Institution

Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA

fYear

2014

fDate

3-6 Dec. 2014

Firstpage

383

Lastpage

387

Abstract

Speech recognition systems are based on either a parametric or a non-parametric approach. Parametric systems such as hidden Markov models (HMMs) have been the dominant technology for speech recognition in the past decade. Despite many advances in the design of these systems, key problems such as long-term temporal dependence have not yet been solved. Recently, owing to the availability of large amounts of data and cheap computing resources (processing power and memory), non-parametric approaches to speech recognition and classification tasks have become popular and feasible. The key advantage of the non-parametric approach is that all the information in the training data is retained, because the data are not approximated with specific statistical models; as a result, more speaker-specific information is preserved. In this paper, we propose a k-nearest neighbor (k-NN) phoneme prediction scheme that uses a speaker ranking vector. The speaker ranking vector is calculated by finding the similarity of the given test speaker to the instance space using k-NN. The results were compared with the nearest neighbor and k-NN majority voting approaches. The proposed scheme gives better prediction accuracy than the nearest neighbor and k-NN majority voting schemes. This approach can help a speech recognizer adapt on the fly to a given talker and customize its training data on the basis of a similarity measure. In this preliminary research, we use a small amount of data to train our phoneme prediction classifier engine. Performance can be further improved by increasing the amount of training data used to find the speaker ranking.
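
The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the ranking vector is computed or how it enters the vote, so the details here are assumptions: each training speaker's rank is taken to be the share of k-NN hits it receives from the test speaker's frames, and each neighbor's phoneme vote is weighted by its speaker's rank. The function names, the Euclidean distance, and the 2-D toy features are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def k_nearest(instances, x, k):
    """Return the k instances closest to x (Euclidean distance, an assumption)."""
    return sorted(instances, key=lambda inst: math.dist(inst[0], x))[:k]

def speaker_ranking(instances, test_frames, k):
    """Assumed ranking: fraction of k-NN hits that each training speaker receives."""
    counts = Counter()
    for x in test_frames:
        for _, _, speaker in k_nearest(instances, x, k):
            counts[speaker] += 1
    total = sum(counts.values())
    return {speaker: c / total for speaker, c in counts.items()}

def majority_vote(instances, x, k):
    """Baseline: plain k-NN majority voting over phoneme labels."""
    votes = Counter(phoneme for _, phoneme, _ in k_nearest(instances, x, k))
    return votes.most_common(1)[0][0]

def predict_phoneme(instances, x, k, ranking):
    """Assumed rule: each neighbor votes with the rank weight of its speaker."""
    scores = defaultdict(float)
    for _, phoneme, speaker in k_nearest(instances, x, k):
        scores[phoneme] += ranking.get(speaker, 0.0)
    return max(scores, key=scores.get)

# Toy instance space: (feature vector, phoneme label, speaker id).
INSTANCES = [
    ((0.0, 0.0), "aa", "A"), ((0.2, 0.0), "aa", "A"), ((0.5, 0.5), "aa", "A"),
    ((0.6, 0.5), "iy", "B"), ((0.5, 0.6), "iy", "B"), ((1.0, 1.0), "iy", "B"),
]
# Frames from the test speaker, used only to build the ranking vector.
TEST_SPEAKER_FRAMES = [(0.1, 0.0), (0.0, 0.1), (0.4, 0.4)]

ranking = speaker_ranking(INSTANCES, TEST_SPEAKER_FRAMES, k=3)
query = (0.55, 0.55)
baseline = majority_vote(INSTANCES, query, k=3)
ranked = predict_phoneme(INSTANCES, query, k=3, ranking=ranking)
```

In this toy data the test speaker resembles speaker A, so the rank-weighted vote returns "aa" for the query frame, while plain majority voting returns "iy" because two of the three neighbors come from speaker B. The sketch shows only the mechanism and says nothing about the accuracy reported in the paper.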

Keywords

learning (artificial intelligence); nonparametric statistics; pattern classification; speaker recognition; TEST speaker; instance space; k-NN; k-nearest neighbor phoneme prediction scheme; nonparametric based approach; phoneme prediction classifier engine; speaker ranking vector; speaker specific information; speech classification task; speech recognition systems; statistical models; training data; Accuracy; Hidden Markov models; Prediction algorithms; Speech; Speech recognition; Training; Training data; classification; k-Nearest Neighbor; phoneme; phoneme prediction; speech recognition; template matching;

fLanguage

English

Publisher

IEEE

Conference_Titel

Machine Learning and Applications (ICMLA), 2014 13th International Conference on

Conference_Location

Detroit, MI

Type

conf

DOI

10.1109/ICMLA.2014.68

Filename

7033145