DocumentCode :
3776091
Title :
Automatic Bengali news documents summarization by introducing sentence frequency and clustering
Author :
Md. Majharul Haque;Suraiya Pervin;Zerina Begum
Author_Institution :
Department of Computer Science & Engineering, University of Dhaka, Dhaka-1000, Bangladesh
fYear :
2015
Firstpage :
156
Lastpage :
160
Abstract :
This paper proposes a method for summarizing Bengali news documents that extracts significant sentences in four major steps: (a) preprocessing, (b) sentence ranking, (c) sentence clustering, and (d) summary generation. A notable feature of the method is its use of sentence frequency, which has the consequence of eliminating redundancy. Another notable aspect is that sentences are clustered on the basis of the similarity ratio among them. Summary sentences are selected from all the clusters, so that the summary covers as much information as possible even when that information is scattered across the input document. Two sets of human-generated summaries were used: one to train the system and the other for performance evaluation. In comparison with the latest state-of-the-art method for Bengali news document summarization, the proposed method performed better. The performance evaluation shows average Precision, Recall, and F-measure values of 0.608, 0.664, and 0.632, respectively.
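The four steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the sentence-frequency measure, the similarity ratio, or the training procedure, so the sketch assumes a word-frequency score for ranking, Jaccard token overlap as the similarity ratio, a greedy threshold clustering, and hypothetical function names (`split_sentences`, `jaccard`, `summarize`) and a hypothetical `threshold` parameter.

```python
import re
from collections import Counter


def split_sentences(text):
    # (a) Preprocessing: split on the Bengali danda and common terminators.
    parts = re.split(r"[।!?.]+", text)
    return [p.strip() for p in parts if p.strip()]


def jaccard(a, b):
    # Assumed similarity ratio: shared tokens over all distinct tokens.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(text, threshold=0.5):
    sentences = split_sentences(text)
    token_sets = [set(s.split()) for s in sentences]

    # (b) Ranking: assumed score is the mean document frequency of the
    # distinct words in each sentence.
    freq = Counter(w for s in sentences for w in s.split())
    scores = [sum(freq[w] for w in toks) / len(toks) for toks in token_sets]

    # (c) Clustering: greedily attach each sentence to the first cluster
    # whose first member is similar enough, else start a new cluster.
    clusters = []
    for i, toks in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(toks, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])

    # (d) Summary generation: take the best-scored sentence from every
    # cluster and restore the original document order.
    chosen = sorted(max(c, key=lambda i: scores[i]) for c in clusters)
    return [sentences[i] for i in chosen]
```

Taking one sentence per cluster is what gives the coverage property the abstract describes: near-duplicate sentences fall into the same cluster and contribute only one summary sentence, while content scattered across the document is still represented.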
Keywords :
"Computer science","Information technology","Electronic mail","Redundancy","Performance evaluation","Art","Internet"
Publisher :
IEEE
Conference_Title :
Computer and Information Technology (ICCIT), 2015 18th International Conference on
Type :
conf
DOI :
10.1109/ICCITechn.2015.7488060
Filename :
7488060