DocumentCode :
1793600
Title :
Guided summarization for Indonesian news articles
Author :
Massandy, Danang Tri ; Khodra, Masayu Leylia
Author_Institution :
Sch. of Electr. Eng. & Inf., Inst. Teknol. Bandung, Bandung, Indonesia
fYear :
2014
fDate :
20-21 Aug. 2014
Firstpage :
140
Lastpage :
145
Abstract :
The number of online news media outlets in Indonesia has grown. One technique for summarizing news articles is guided summarization, in which the summary should contain information on the important aspects. Guided summarization techniques were developed at the Text Analysis Conference (TAC) 2011, and one of the best methods is SWING by Jun-ping et al. The purpose of this study is to adapt the methods of the SWING system to Indonesian news articles and to integrate it with a News Aggregator system. The experiments aim to determine the best features and system configuration when the method is adapted to Indonesian news articles. ROUGE-2 and ROUGE-SU4 are used to evaluate the results by comparing system-generated summaries with human-made summaries. The best system configuration produces summaries with ROUGE-2 of 0.31 and ROUGE-SU4 of 0.22, which is very close to the human-made summaries, with ROUGE-2 of 0.32 and ROUGE-SU4 of 0.24. In addition, the update summarization component can produce a summary of updates without repeating information. The adaptation of the SWING system to Indonesian news articles employs features such as sentence length (SL), category relevance score (CRS), category KL-divergence (CKLD), bigram DFS (BDFS), Top n NE corpus, Top n NE topic, and quote sentence removal, and builds an SVR model for each news category.
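As a rough illustration of the ROUGE-2 evaluation mentioned in the abstract, the sketch below computes bigram recall between a system summary and one human-made reference. It is a simplified assumption, not the authors' evaluation setup: tokenization is plain whitespace splitting, there is no stemming or stopword handling, and the function names `bigrams` and `rouge_2_recall` and the sample sentences are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Count adjacent word pairs in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2_recall(candidate, reference):
    """Fraction of reference bigrams that also occur in the candidate.

    Matches are clipped, so a bigram is credited at most as many times
    as it appears in the candidate summary.
    """
    cand = bigrams(candidate)
    ref = bigrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference too short to contain any bigram
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total


# Hypothetical example sentences, not taken from the paper's data.
reference = "banjir melanda jakarta pada hari senin"
candidate = "banjir melanda jakarta hari senin"
score = rouge_2_recall(candidate, reference)  # 3 of 5 reference bigrams match
```

ROUGE-SU4 follows the same counting idea but uses unigrams plus skip-bigrams with a gap of at most four words, so it would need a different pair-extraction function.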
Keywords :
information resources; text analysis; BDFS; CKLD; CRS; Indonesian news articles; ROUGE-2; ROUGE-SU4; SL; SVR model; SWING system; bigram DFS; category KL-divergence; category relevance score; guided summarization techniques; human-made summaries; news aggregator system; news articles summarization technique; online news media development; quote sentence removal; sentence length; top n NE corpus; top n NE topic; Accidents; Adaptation models; Feature extraction; Informatics; Predictive models; Redundancy; Training data; Indonesian; ROUGE; guided summarization; news aggregator; news articles;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Informatics: Concept, Theory and Application (ICAICTA), 2014 International Conference of
Conference_Location :
Bandung
Print_ISBN :
978-1-4799-6984-5
Type :
conf
DOI :
10.1109/ICAICTA.2014.7005930
Filename :
7005930