DocumentCode :
74383
Title :
Language identification for Internet security in the Basque context: A cross-lingual approach
Author :
Barroso, N. ; Lopez de Ipina, Karmele ; Ezeiza, A. ; Hernandez, C.
Author_Institution :
Polytech. Sch., Univ. of the Basque Country, Bilbao, Spain
Volume :
28
Issue :
8
fYear :
2013
fDate :
Aug. 2013
Firstpage :
24
Lastpage :
31
Abstract :
The present work describes the development of a language identification (LID) system suited to handling security tasks on the Internet. The development context was the Infozazpi Internet digital radio, and the task presented substantial complexity due to the trilingual environment and the scarcity of language resources for Basque. To overcome these difficulties, we propose a hybrid system based on the selection of subword units by SVMs, MLP classifiers, and discriminant analysis improved with robust regularized covariance matrix estimation methods and stochastic methods for ASR tasks (SC-HMM and n-grams). Our new subword unit proposals and the use of triphones and cross-lingual approaches considerably improve the system performance, achieving an optimal and stable LID recognition rate despite the complexity of the problem.
Keywords :
Internet; covariance matrices; digital radio; estimation theory; hidden Markov models; multilayer perceptrons; natural language processing; security of data; speech recognition; support vector machines; ASR tasks; Basque context; Infozazpi Internet digital radio; Internet security; LID recognition rate; LID system; MLP classifier; SC-HMM; SVM classifier; cross-lingual approach; discriminant analysis; handling security tasks; hybrid system; language identification; language resources; n-grams; robust regularized covariance matrix estimation methods; stochastic methods; subword unit proposals; subword units; system performance; trilingual environment; triphones; Automatic speech recognition; Context awareness; Hidden Markov models; Internet; Natural language processing; Security; Terminology;
fLanguage :
English
Journal_Title :
Aerospace and Electronic Systems Magazine, IEEE
Publisher :
IEEE
ISSN :
0885-8985
Type :
jour
DOI :
10.1109/MAES.2013.6575408
Filename :
6575408