• DocumentCode
    655336
  • Title

    A Novel Technique for Efficient Text Document Summarization as a Service

  • Author

    Bagalkotkar, A. ; Kandelwal, A. ; Pandey, Shishir ; Kamath, S. Sowmya

  • Author_Institution
    Dept. of Inf. Technol., NITK, Surathkal, India
  • fYear
    2013
  • fDate
    29-31 Aug. 2013
  • Firstpage
    50
  • Lastpage
    53
  • Abstract
    Due to exponential growth in the generation of Web data, tools and mechanisms for automatic summarization of Web documents have become critical. Web data can be accessed from multiple sources, e.g., on different Web pages, which makes searching for relevant pieces of information a difficult task. Therefore, an automatic summarizer is vital for reducing human effort. Text summarization is an important activity in the analysis of high-volume text documents and is currently a major research topic in Natural Language Processing. It is the process of generating a summary of an input document by extracting representative sentences from it. In this paper, we present a novel technique for summarizing domain-specific text from a single Web document by applying statistical NLP techniques to the text in a reference corpus and to the Web document. The proposed summarizer generates a summary based on the calculated Sentence Weight (SW), the rank of a sentence in the document's content, the number of terms and the number of words in a sentence, and term frequency in the input corpus.
  • Keywords
    Internet; document handling; natural language processing; statistical analysis; text analysis; SW; Web data generation; Web pages; automatic Web document summarization; high volume text document analysis; natural language processing; reference corpus; representative sentence extraction; sentence weight; statistical NLP techniques; term frequency; text document summarization as a service technique; Data mining; Information technology; Length measurement; Natural language processing; Semantics; Simple object access protocol; Knowledge Extraction; Natural Language Processing; POS Tagging; Text Summarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Computing and Communications (ICACC), 2013 Third International Conference on
  • Conference_Location
    Cochin
  • Type

    conf

  • DOI
    10.1109/ICACC.2013.17
  • Filename
    6686335