Title :
Measuring Relevance with Named Entity Based Patterns in Topic-Focused Document Summarization
Author :
Wei, Furu ; Li, Wenjie ; He, Yanxiang
Author_Institution :
Wuhan Univ., Wuhan
fDate :
Aug. 30 2007-Sept. 1 2007
Abstract :
This paper emphasizes the role of named entity based patterns in measuring the relevance between document sentences and the topic for topic-focused extractive summarization. Patterns are defined as informative, semantic-sensitive text bi-grams that contain at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. When topic questions are handled, question types are also taken into account if they are available. To alleviate coverage problems, the pattern and uni-gram models are integrated so that they complement each other in the similarity calculation. Automatic ROUGE evaluations indicate that the proposed approach yields a system that outperforms the best-performing system at the Document Understanding Conference (DUC) 2005.
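The integration of pattern and uni-gram similarity described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not list the eight templates, so the sketch simply takes any adjacent token pair containing a named entity; the toy lexicon `ENTITY_CLASS`, the cosine measure, the linear interpolation, and the `weight` parameter are likewise hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy entity lexicon: token -> semantic class (not from the paper).
ENTITY_CLASS = {"beijing": "LOCATION", "ieee": "ORGANIZATION"}


def unigrams(tokens):
    """Bag-of-words counts for a token list."""
    return Counter(tokens)


def entity_bigrams(tokens, entity_class=ENTITY_CLASS):
    """Count adjacent token pairs in which at least one token is a named entity.

    Simplification: each entity token is replaced by its semantic class,
    so ("in", "beijing") becomes ("in", "LOCATION").
    """
    patterns = Counter()
    for left, right in zip(tokens, tokens[1:]):
        if left in entity_class or right in entity_class:
            key = (entity_class.get(left, left), entity_class.get(right, right))
            patterns[key] += 1
    return patterns


def cosine(a, b):
    """Cosine similarity between two Counters; 0.0 if either is empty."""
    dot = sum(count * b[key] for key, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def relevance(sentence, topic, weight=0.5):
    """Assumed linear mix of pattern similarity and uni-gram similarity."""
    s, t = sentence.lower().split(), topic.lower().split()
    pattern_sim = cosine(entity_bigrams(s), entity_bigrams(t))
    unigram_sim = cosine(unigrams(s), unigrams(t))
    return weight * pattern_sim + (1 - weight) * unigram_sim
```

In this sketch the uni-gram term still contributes when a sentence and the topic share no entity patterns, which is one plausible reading of how the two models compensate for each other's coverage gaps.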
Keywords :
information retrieval; text analysis; information extraction; named entity based pattern; semantic class; topic-focused document summarization; Computer science; Current measurement; Data mining; Measurement units; Tree graphs
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
DOI :
10.1109/NLPKE.2007.4368020