• DocumentCode
    3604665
  • Title

    Statistical Framework with Knowledge Base Integration for Robust Speech Understanding of the Tunisian Dialect

  • Author

    Graja, M. ; Jaoua, M. ; Belguith, L. Hadrich

  • Author_Institution
    ANLP-RG of the Miracl Lab. (Multimedia, Univ. of Sfax, Sfax, Tunisia
  • Volume
    23
  • Issue
    12
  • fYear
    2015
  • Firstpage
    2311
  • Lastpage
    2321
  • Abstract
    In this paper, we propose a hybrid method for understanding the spoken Tunisian dialect within a limited task. This method couples a discriminative statistical method with a domain ontology. The statistical method is based on conditional random field (CRF) models learned from a small corpus to perform a conceptual labeling task. These models are able to detect the semantic dependencies between words, while the domain ontology is used to add prior knowledge about the task. Our experiments are based on a real spoken Tunisian dialect corpus. The results show that integrating the domain ontology improves the performance of CRF models for speech understanding. Our method can be applied to under-resourced languages and Arabic dialects to overcome the lack of linguistic resources.
  • Keywords
    natural language processing; ontologies (artificial intelligence); speech processing; statistical analysis; CRF models; conditional random field models; discriminative statistical method; domain ontology; knowledge base integration; real spoken Tunisian dialect corpus; robust speech understanding; semantic dependency; statistical framework; Biological system modeling; Hidden Markov models; Knowledge management; Ontologies; Semantics; Conditional random field (CRF); Tunisian dialect (TD); domain ontology; knowledge base; speech understanding; statistical models;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type

    jour

  • DOI
    10.1109/TASLP.2015.2464687
  • Filename
    7208837