DocumentCode :
3645148
Title :
Mining textual significant expressions reflecting opinions in natural languages
Author :
Jan Žižka;František Dařena
Author_Institution :
Department of Informatics / SoNet Research Center, Mendel University in Brno, Brno, Czech Republic
fYear :
2011
Firstpage :
136
Lastpage :
141
Abstract :
Revealing an opinion hidden in a text document is a challenging task. The article presents a method based on the automatic extraction of expressions that are significant for specifying a document's attitude to a given topic. The significant expressions are built from significant words found in the documents. The significant words are selected by the C5 decision-tree generator, which is based on entropy minimization. Words included in the tree's branches represent the kernels of the significant expressions. The full expressions are composed of the significant words and the words surrounding them in the original documents. Such expressions provide much more information than individual (key)words and can be used for analysing a document's meaning and the cause of the opinion: what exactly does the opinion deal with? The results are demonstrated using large real-world multilingual data representing customers' opinions written in a free form.
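The two steps named in the abstract (entropy-based selection of significant words, then expansion of each word with its neighbours) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use C5. It only ranks words by the information gain of a single presence/absence split, which is the kind of entropy criterion a decision-tree generator applies at each node. The toy reviews, the labels, and the names `information_gain`, `significant_words`, `expressions`, `top_k`, and `window` are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Entropy reduction from splitting the documents on presence of `word`."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    remainder = (len(with_word) / n) * entropy(with_word) \
        + (len(without) / n) * entropy(without)
    return entropy(labels) - remainder


def significant_words(docs, labels, top_k=2):
    """Rank the vocabulary by information gain (a stand-in for tree induction)."""
    vocab = sorted({w for d in docs for w in d})
    ranked = sorted(vocab,
                    key=lambda w: information_gain(docs, labels, w),
                    reverse=True)
    return ranked[:top_k]


def expressions(docs, kernels, window=1):
    """Collect each kernel word together with `window` neighbours on each side."""
    found = Counter()
    for d in docs:
        for i, w in enumerate(d):
            if w in kernels:
                lo = max(0, i - window)
                found[" ".join(d[lo:i + window + 1])] += 1
    return found


# Hypothetical toy data, not taken from the paper.
docs = [
    "the room was very clean".split(),
    "very clean and quiet hotel".split(),
    "the room was not clean".split(),
    "staff was not friendly".split(),
]
labels = ["pos", "pos", "neg", "neg"]

kernels = significant_words(docs, labels, top_k=2)
found = expressions(docs, kernels, window=1)
for phrase, count in found.items():
    print(count, phrase)
```

In this toy set, a word such as "clean" occurs in both classes and is a weak splitter on its own, while the extracted phrases around the kernel words show what the opinion is about. A real decision tree would repeat the split recursively on each subset rather than rank all words once.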
Keywords :
"Entropy","Natural languages","Intelligent systems","Internet","Accuracy","Decision trees","Kernel"
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
ISSN :
2164-7143
Print_ISBN :
978-1-4577-1676-8
Electronic_ISBN :
2164-7151
Type :
conf
DOI :
10.1109/ISDA.2011.6121644
Filename :
6121644