Regional Information Center for Science and Technology - Inferring clinical depression from speech and spoken utterances

DocumentCode :

155616

Title :

Inferring clinical depression from speech and spoken utterances

Author :

Asgari, M. ; Shafran, Izhak ; Sheeber, Lisa B.

Author_Institution :

Center for Spoken Language Understanding, Oregon Health & Sci. Univ., Portland, OR, USA

fYear :

2014

fDate :

21-24 Sept. 2014

Firstpage :

Lastpage :

Abstract :

In this paper, we investigate the problem of detecting depression from recordings of subjects' speech using speech processing and machine learning. There has been considerable interest in this problem in recent years due to the potential for developing objective assessments from real-world behaviors, which may provide valuable supplementary clinical information or may be useful in screening. The cues for depression may be present in “what is said” (content) and “how it is said” (prosody). Given the limited amounts of text data, even in this relatively large study, it is difficult to employ the standard method of learning models from n-gram features. Instead, we learn models using word representations in an alternative feature space of valence and arousal. This is akin to embedding words into a real vector space, albeit with manual ratings instead of those learned with deep neural networks [1]. For extracting prosody, we employ standard feature extractors such as those implemented in openSMILE and compare them with features extracted from harmonic models that we have been developing in recent years. Our experiments show that features from our harmonic model detect depression from spoken utterances more accurately than the alternatives. The context features provide additional improvements to achieve an accuracy of about 74%, sufficient to be useful in screening applications.
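
The valence and arousal word representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, its 1-9 rating scale, the numeric values, and the function name `utterance_features` are all hypothetical and chosen only for illustration. It assumes each rated word maps to a (valence, arousal) pair and that an utterance is summarized by averaging over the rated words it contains.

```python
from statistics import mean

# Hypothetical manual ratings as (valence, arousal) on an assumed 1-9 scale.
# These values are illustrative only and are not taken from any published lexicon.
TOY_LEXICON = {
    "happy": (8.0, 6.0),
    "sad": (2.0, 4.0),
    "tired": (3.0, 2.0),
    "calm": (7.0, 2.0),
}


def utterance_features(utterance, lexicon=TOY_LEXICON):
    """Return (mean valence, mean arousal, coverage) for one utterance.

    Coverage is the fraction of tokens that have a rating. Returns None
    when no token in the utterance is rated.
    """
    tokens = utterance.lower().split()
    rated = [lexicon[t] for t in tokens if t in lexicon]
    if not rated:
        return None
    valence = mean(v for v, _ in rated)
    arousal = mean(a for _, a in rated)
    coverage = len(rated) / len(tokens)
    return valence, arousal, coverage


if __name__ == "__main__":
    print(utterance_features("I feel sad and tired"))
```

A representation like this stays two-dimensional regardless of vocabulary size, which is one plausible reason it suits small text collections better than sparse n-gram counts. Features of this kind would then be passed to a classifier, which the sketch leaves out.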

Keywords :

learning (artificial intelligence); neural nets; psychology; speech processing; clinical depression inference; feature extraction; harmonic models; learning models; machine learning; neural networks; openSMILE; speech processing; speech utterances; spoken utterances; word representations; Abstracts; Speech; Depression; Speech analysis; Telemedicine;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on

Conference_Location :

Reims

Type :

conf

DOI :

10.1109/MLSP.2014.6958856

Filename :

6958856

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=155616