• DocumentCode
    1662860
  • Title

    A speech emotion recognition framework based on latent Dirichlet allocation: Algorithm and FPGA implementation

  • Author

    Shah, Mubarak ; Miao, Lifeng ; Chakrabarti, Chaitali ; Spanias, A.

  • Author_Institution
    Sch. of Electr., Arizona State Univ., Tempe, AZ, USA
  • fYear
    2013
  • Firstpage
    2553
  • Lastpage
    2557
  • Abstract
    In this paper, we present a speech-based emotion recognition framework based on a latent Dirichlet allocation model. This method assumes that incoming speech frames are conditionally independent and exchangeable. While this leads to a loss of temporal structure, it is able to capture significant statistical information between frames. In contrast, a hidden Markov model-based approach captures the temporal structure in speech. Using the German emotional speech database EMO-DB for evaluation, we achieve an average classification accuracy of 80.7% compared to 73% for hidden Markov models. This improvement is achieved at the cost of a slight increase in computational complexity. We map the proposed algorithm onto an FPGA platform and show that emotions in a speech utterance of duration 1.5 s can be identified in 1.8 ms, while utilizing 70% of the resources. This further demonstrates the suitability of our approach for real-time applications on hand-held devices.
  • Keywords
    computational complexity; emotion recognition; field programmable gate arrays; hidden Markov models; speech recognition; EMO-DB; FPGA platform; German emotional speech database; hand-held devices; latent Dirichlet allocation model; speech emotion recognition framework; speech utterance; statistical information; temporal structure; time 1.5 s; time 1.8 ms; Abstracts; Complexity theory; Educational institutions; Speech; FPGA implementation; affective computing; latent Dirichlet allocation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2013.6638116
  • Filename
    6638116