Title :
An Automatic Linguistics Approach for Persian Document Summarization
Author :
Kamyar, Hossein ; Kahani, Mohsen ; Kamyar, Mohsen ; Poormasoomi, Asef
Author_Institution :
Web Technol. Lab., Ferdowsi Univ. of Mashhad, Mashhad, Iran
Abstract :
In this paper we propose a novel technique for summarizing a text based on the linguistic properties of text elements and the semantic chains among them. Most summarization approaches rely mainly on statistical properties of text elements, such as term frequency. Here we use centering theory, which helps us recognize semantic chains in a text, to propose a new automatic single-document summarization approach. To process a text with centering theory and extract a coherent summary, a processing pipeline must be constructed. This pipeline consists of several components, such as co-reference resolution, semantic role labeling, and POS (part-of-speech) tagging.
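As an illustration of the centering-theory step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that an upstream pipeline (co-reference resolution, POS tagging, semantic role labeling) has already reduced each utterance to a ranked list of entity identifiers, most salient first. The function names and the `NO_CB` label are hypothetical; the four transition types follow the standard centering-theory classification.

```python
from typing import List, Optional


def backward_center(prev_cf: List[str], cur_cf: List[str]) -> Optional[str]:
    """Cb(U_n): the highest-ranked entity of Cf(U_{n-1}) realized in U_n."""
    for entity in prev_cf:
        if entity in cur_cf:
            return entity
    return None


def classify_transition(prev_cb: Optional[str],
                        cur_cb: Optional[str],
                        cur_cp: Optional[str]) -> str:
    """Classify one transition from the previous and current centers."""
    if cur_cb is None:
        # Hypothetical label: the utterances share no entity at all.
        return "NO_CB"
    # An undefined previous Cb is treated as compatible with any current Cb.
    same_cb = prev_cb is None or cur_cb == prev_cb
    if same_cb:
        return "CONTINUE" if cur_cb == cur_cp else "RETAIN"
    return "SMOOTH_SHIFT" if cur_cb == cur_cp else "ROUGH_SHIFT"


def transitions(utterances: List[List[str]]) -> List[str]:
    """Return the transition label for each adjacent pair of utterances.

    Each utterance is its forward-looking center list Cf, ranked by
    salience (assumed to be supplied by the upstream pipeline).
    """
    labels = []
    prev_cb: Optional[str] = None
    for prev_cf, cur_cf in zip(utterances, utterances[1:]):
        cur_cb = backward_center(prev_cf, cur_cf)
        cur_cp = cur_cf[0] if cur_cf else None  # preferred center Cp
        labels.append(classify_transition(prev_cb, cur_cb, cur_cp))
        prev_cb = cur_cb
    return labels


example = [
    ["john", "store"],
    ["john", "piano"],
    ["piano", "john"],
    ["mary", "piano"],
]
print(transitions(example))
```

A summarizer built on this idea could, for instance, prefer sentence sequences dominated by `CONTINUE` transitions, since these indicate that the text stays focused on the same entity; how the paper actually scores or selects sentences is not stated in the abstract.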
Keywords :
computational linguistics; natural language processing; pipeline processing; statistical analysis; text analysis; POS tagging; Persian document summarization approach; automatic linguistics approach; co-reference resolution; semantic chain recognition; semantic role labeling; statistical properties; text element linguistics properties; Coherence; Data mining; Educational institutions; Humans; Pragmatics; Semantics; Text recognition; Centering Theory; Extractive; LSI; Persian; Single-document summarization
Conference_Title :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
DOI :
10.1109/IALP.2011.52