Title of article :
Machine learning approaches to diagnosis and laterality effects in semantic dementia discourse
Author/Authors :
Garrard, Peter and Rentoumi, Vassiliki and Gesierich, Benno and Miller, Bruce and Gorno-Tempini, Maria Luisa
Issue Information :
Journal with serial number, year 2014
Pages :
8
From page :
122
To page :
129
Abstract :
Advances in automatic text classification have been necessitated by the rapid increase in the availability of digital documents. Machine learning (ML) algorithms can ‘learn’ from data: for instance, an ML system can be trained on a set of features derived from written texts belonging to known categories, and learn to distinguish between them. Such a trained system can then be used to classify unseen texts. In this paper, we explore the potential of the technique to classify transcribed speech samples along clinical dimensions, using vocabulary data alone. We report the accuracy with which two related ML algorithms [naive Bayes Gaussian (NBG) and naive Bayes multinomial (NBM)] categorized picture descriptions produced by: 32 semantic dementia (SD) patients versus 10 healthy, age-matched controls; and SD patients with left- (n = 21) versus right-predominant (n = 11) patterns of temporal lobe atrophy. We used information gain (IG) to identify the vocabulary features that were most informative to each of these two distinctions. In the SD versus control classification task, both algorithms achieved accuracies of greater than 90%. In the right- versus left-temporal lobe predominant classification, NBM achieved a high level of accuracy (88%), whereas both NBM and NBG achieved this level when the features used in the training set were restricted to those with high values of IG. The most informative features for the patient versus control task were low-frequency content words, generic terms and components of metanarrative statements. For the right versus left task, the number of informative lexical features was too small to support any specific inferences. An enriched feature set, including values derived from Quantitative Production Analysis (QPA), may shed further light on this little-understood distinction.
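The pipeline the abstract describes (rank vocabulary features by information gain, then classify with multinomial naive Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: the four toy transcripts, the labels, the binary word-presence definition of IG, the add-one smoothing, and the function names `information_gain`, `train_nbm` and `predict_nbm` are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(docs, labels, word):
    """IG of the binary feature 'word occurs in the document' (assumed definition)."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder

def train_nbm(docs, labels, vocab):
    """Fit multinomial naive Bayes with add-one smoothing over the words in vocab."""
    classes = sorted(set(labels))
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        counts = Counter(w for d, y in zip(docs, labels) if y == c
                         for w in d if w in vocab)
        total = sum(counts.values()) + len(vocab)
        log_like[c] = {w: math.log((counts[w] + 1) / total) for w in vocab}
    return log_prior, log_like

def predict_nbm(model, doc):
    """Return the class with the highest posterior log-score; unknown words are ignored."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][w] for w in doc if w in log_like[c])
              for c in log_prior}
    return max(scores, key=scores.get)

# Invented toy transcripts, NOT data from the study.
docs = [
    "the boy is taking a thing from the thing".split(),
    "i do not know what that thing is".split(),
    "the boy is stealing cookies from the jar".split(),
    "the mother is drying dishes at the sink".split(),
]
labels = ["patient", "patient", "control", "control"]

# Rank the vocabulary by IG and keep the five highest-scoring words.
vocab = sorted({w for d in docs for w in d})
ranked = sorted(vocab, key=lambda w: (-information_gain(docs, labels, w), w))
top_features = ranked[:5]

# Train on the restricted feature set and classify an unseen transcript.
model = train_nbm(docs, labels, top_features)
prediction = predict_nbm(model, "she cannot name that thing".split())
print(top_features[0], prediction)
```

In this toy setting the generic term "thing" separates the two groups perfectly, which loosely mirrors the abstract's observation that generic terms were among the informative features; with real transcripts the ranking and accuracy would of course depend on the data.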
Keywords :
Machine Learning , laterality , information gain , Discourse , Semantic dementia
Journal title :
Cortex
Serial Year :
2014
Record number :
2301707