Regional Information Center for Science and Technology - Using cross-decoder phone coocurrences in phonotactic language recognition

DocumentCode :

2790390

Title :

Using cross-decoder phone coocurrences in phonotactic language recognition

Author :

Penagarikano, Mikel ; Varona, Amparo ; Rodríguez-Fuentes, Luis Javier ; Bordel, Germán

Author_Institution :

Dept. of Electr. & Electron., GTTS, Univ. of the Basque Country, San Sebastian, Spain

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

5034

Lastpage :

5037

Abstract :

Phonotactic language recognizers are based on the ability of phone decoders to produce phone sequences containing acoustic, phonetic and phonological information, which is partially dependent on the language. Input utterances are decoded and then scored by means of models for the target languages. Commonly, several decoders are applied in parallel and fused at the score level. A complementarity effect is expected when fusing scores, since each decoder is assumed to extract different (and complementary) information from the input utterance. This assumption is supported by the performance improvements attained when fusing systems. However, decodings are processed in a fully uncoupled way, so their time alignment (and the information that may be extracted from it) is completely lost. In this paper, a simple approach is proposed which takes time alignment information into account by considering cross-decoder phone co-occurrences at the frame level. To evaluate the approach, a choice of open software (BUT front-end and phone decoders, SRI-LM toolkit, libSVM, FoCal) is used, and experiments are carried out on the NIST LRE2007 database. Adding phone co-occurrences to the baseline phonotactic systems provides slight performance improvements, revealing the potential benefit of using cross-decoder dependencies for language modeling.
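
The abstract does not say how the co-occurrences are counted, so the following is only a minimal sketch of one plausible reading: each decoding is expanded to one phone label per frame, and pairs of labels that two decoders emit for the same frame are counted. The segment format, the function names (`to_frames`, `cooccurrence_counts`) and the toy phone labels are all assumptions made for illustration; they are not taken from the paper or from the BUT decoders.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical segment format: (phone label, start frame, end frame exclusive).
Segment = Tuple[str, int, int]


def to_frames(segments: List[Segment]) -> List[str]:
    """Expand a segment-level decoding into one phone label per frame.

    Assumes the segments are contiguous and start at frame 0.
    """
    frames: List[str] = []
    for phone, start, end in segments:
        frames.extend([phone] * (end - start))
    return frames


def cooccurrence_counts(dec_a: List[Segment], dec_b: List[Segment]) -> Counter:
    """Count (phone_a, phone_b) pairs observed on the same frame.

    If the two decodings differ in length, zip() stops at the shorter one.
    """
    return Counter(zip(to_frames(dec_a), to_frames(dec_b)))


# Toy decodings of the same 5-frame utterance by two hypothetical decoders.
decoder_a = [("a", 0, 3), ("t", 3, 5)]
decoder_b = [("A", 0, 2), ("d", 2, 5)]

counts = cooccurrence_counts(decoder_a, decoder_b)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Counts of this kind could then be normalized and used as features next to the usual per-decoder phone n-gram statistics; how the paper actually combines them with the baseline systems is not stated in this record.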

Keywords :

crosstalk; decoding; natural language processing; public domain software; speech processing; speech recognition; NIST LRE2007 database; cross-decoder phone coocurrence; open software; phonotactic language recognition; Costs; Data mining; Databases; Decoding; NIST; Natural languages; Software tools; Speech recognition; Statistics; Training data; Language Recognition; Phone Coocurrence; Phone Decoding;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495056

Filename :

5495056

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2790390