Title :
Building a Language Model for Local Coherence in Multi-document Summaries Using a Discourse-Enriched Entity-Based Model
Author :
Castro Jorge, Maria Lucía R. ; Dias, Márcio S. ; Pardo, Thiago A. S.
Author_Institution :
Interinstitutional Center for Comput. Linguistics (NILC), Inst. of Math. & Comput. Sci., Univ. of Sao Paulo, Sao Paulo, Brazil
Abstract :
Local coherence is a very important aspect of multi-document summarization, since good summaries not only condense the most relevant information but also present it in a well-organized structure. One of the most investigated models for local coherence is the Entity-based model, which has been used successfully because it facilitates a computational approach to coherence measurement. In particular, this model has been used to evaluate local coherence in multi-document summaries, achieving promising results. To improve the potential of the Entity-based model, we propose the creation of a language model for multi-document summaries that integrates the Entity-based model with discourse knowledge, mainly from Cross-document Structure Theory. Our results show that this type of information enriches the Entity-based model by capturing other phenomena that are inherent to multi-document summaries, such as redundancy and complementarity, which improves the performance of the original model.
Keywords :
computational linguistics; document handling; information retrieval; coherence measurement; cross-document structure theory; discourse-enriched entity-based model; language model; local coherence; multidocument summarization; redundancy; Accuracy; Buildings; Coherence; Computational modeling; Proposals; Redundancy; Vectors; discourse models; entity-based model; multi-document summarization;
Conference_Title :
Intelligent Systems (BRACIS), 2014 Brazilian Conference on
Conference_Location :
Sao Paulo
DOI :
10.1109/BRACIS.2014.19