Title :
Research on query-based automatic summarization of webpage
Author :
Chen, Zhimin ; Shen, Jie
Author_Institution :
Inst. of Inf. Eng., Yangzhou Univ., Yangzhou, China
Abstract :
To overcome the shortcomings of traditional automatic summarization technologies, this paper proposes a novel automatic method for Webpage summarization based on the user's query. The method consists of three steps. First, an HTML document is segmented into topic blocks. Second, the weight of every sentence in a topic block is calculated with respect to the query keywords. Finally, the most important sentences are dynamically extracted to compose the digest, according to the expected compression ratio, using an improved maximal marginal relevance method that removes redundant information from the summary. The experimental results show that the proposed method can increase the digest's coverage of the original document and effectively improve information retrieval performance.
Keywords :
Web sites; information analysis; query processing; HTML document; information retrieval; maximal marginal relevance; query keywords; query-based automatic Webpage summarization; Automatic control; Communication system control; Data mining; Engineering management; HTML; Information retrieval; Search engines; Technology management; Vocabulary; Web search; Maximal Marginal Relevance; automatic summarization; query-based; topic segmentation;
Conference_Title :
Computing, Communication, Control, and Management, 2009. CCCM 2009. ISECS International Colloquium on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-4247-8
DOI :
10.1109/CCCM.2009.5270475
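
The extraction step described in the abstract can be illustrated with a minimal sketch of standard maximal marginal relevance (MMR) selection. This is not the paper's improved variant, whose details the abstract does not give, and it skips topic segmentation entirely. The function names, the bag-of-words cosine similarity, and the parameters `compression_ratio` and `lam` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_summarize(sentences, query, compression_ratio=0.3, lam=0.7):
    """Pick sentences by plain MMR (hypothetical helper, not the paper's method).

    Each step selects the candidate that maximizes
    lam * sim(sentence, query) - (1 - lam) * max sim(sentence, already selected).
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    query_vec = Counter(tokenize(query))
    relevance = [cosine(v, query_vec) for v in vectors]

    # Number of sentences to keep, derived from the assumed compression ratio.
    target = max(1, round(len(sentences) * compression_ratio))

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < target:

        def score(i):
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

With `lam` close to 1 the selection favors query relevance; lower values penalize sentences that repeat what has already been selected, which is the redundancy-removal role the abstract attributes to MMR.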