DocumentCode
655336
Title
A Novel Technique for Efficient Text Document Summarization as a Service
Author
Bagalkotkar, A. ; Kandelwal, A. ; Pandey, Shishir ; Kamath, S. Sowmya
Author_Institution
Dept. of Inf. Technol., NITK, Surathkal, India
fYear
2013
fDate
29-31 Aug. 2013
Firstpage
50
Lastpage
53
Abstract
Due to exponential growth in the generation of Web data, the need for tools and mechanisms for automatic summarization of Web documents has become critical. Web data can be accessed from multiple sources, e.g. different Web pages, which makes searching for relevant pieces of information a difficult task. Therefore, an automatic summarizer is vital for reducing human effort. Text summarization is an important activity in the analysis of high-volume text documents and is currently a major research topic in Natural Language Processing. It is the process of generating a summary of an input document by extracting representative sentences from it. In this paper, we present a novel technique for summarizing domain-specific text from a single Web document by applying statistical NLP techniques to the text in a reference corpus and to the Web document. The proposed summarizer generates a summary based on the calculated Sentence Weight (SW), the rank of a sentence in the document's content, the number of terms and the number of words in a sentence, and term frequency in the input corpus.
Keywords
Internet; document handling; natural language processing; statistical analysis; text analysis; SW; Web data generation; Web pages; automatic Web document summarization; high volume text document analysis; natural language processing; reference corpus; representative sentence extraction; sentence weight; statistical NLP techniques; term frequency; text document summarization as a service technique; Data mining; Information technology; Length measurement; Natural language processing; Semantics; Simple object access protocol; Knowledge Extraction; Natural Language Processing; POS Tagging; Text Summarization;
fLanguage
English
Publisher
ieee
Conference_Titel
Advances in Computing and Communications (ICACC), 2013 Third International Conference on
Conference_Location
Cochin
Type
conf
DOI
10.1109/ICACC.2013.17
Filename
6686335