Title :
Statistical Framework with Knowledge Base Integration for Robust Speech Understanding of the Tunisian Dialect
Author :
Graja, M. ; Jaoua, M. ; Belguith, L. Hadrich
Author_Institution :
ANLP-RG of the Miracl Lab. (Multimedia, Univ. of Sfax, Sfax, Tunisia
Abstract :
In this paper, we propose a hybrid method for understanding the spoken Tunisian dialect within a limited task. The method couples a discriminative statistical method with a domain ontology. The statistical method is based on conditional random field (CRF) models learned from a small corpus to perform the conceptual labeling task. These models are able to detect the semantic dependencies between words. The domain ontology, in turn, is used to add prior knowledge about the task. Our experiments are based on a real spoken Tunisian dialect corpus. The results show that integrating the domain ontology improves the performance of CRF models for speech understanding. Our method can be applied to under-resourced languages and Arabic dialects to overcome the lack of linguistic resources.
Keywords :
natural language processing; ontologies (artificial intelligence); speech processing; statistical analysis; CRF models; conditional random field models; discriminative statistical method; domain ontology; knowledge base integration; real spoken Tunisian dialect corpus; robust speech understanding; semantic dependency; statistical framework; Biological system modeling; Hidden Markov models; Knowledge management; Ontologies; Semantics; Conditional random field (CRF); Tunisian dialect (TD); domain ontology; knowledge base; speech understanding; statistical models;
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2015.2464687