Title :
Dynamic presentation of phrasally-based document abstractions
Author :
Boguraev, B. ; Bellamy, R. ; Kennedy, C.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Summarisation technologies today work, in essence, by performing data reduction over the original document source. Document fragments, identified as particularly representative of content, are extracted and offered to the user. Typically, such fragments are sentence-sized, and the summary is nothing more than a concatenation of these sentences. We argue that, for content characterisation, phrasal units with certain discourse properties are more representative than sentences. From such a position, we outline a model of document content abstraction based on a notion of topically prominent topic stamps. For such abstractions to be useful, they need to retain contextual highlights of their occurrences in the documents; to be usable, they further need to be able to function as windows into the full documents, with suitably designed interfaces for navigation into areas of particular interest. This paper proposes a method for contextualizing document highlights, relates this to our model of salience-based content characterization, and demonstrates how the document abstractions derived from such principles facilitate dynamic document content presentation. We argue that dynamic document abstractions effectively mediate different levels of granularity analysis, from terse document highlights to fully contextualized foci of particular interest. We close by describing a range of dynamic document viewers which embody novel presentation metaphors for the delivery of document content.
Keywords :
abstracting; data reduction; text analysis; content characterisation; content-representative document fragments; contextual highlights; discourse properties; document content abstraction; dynamic document content presentation; dynamic document viewers; fully contextualized foci; granularity analysis; navigational interfaces; phrasal units; phrasally-based document abstractions; salience-based content characterization; sentence concatenation; summarisation; terse document highlights; topically prominent topic stamps; Abstracts; Data mining; Dynamic range; Machinery; Natural language processing; Navigation; Performance analysis; Read only memory
Conference_Title :
Systems Sciences, 1999. HICSS-32. Proceedings of the 32nd Annual Hawaii International Conference on
Conference_Location :
Maui, HI, USA
Print_ISBN :
0-7695-0001-3
DOI :
10.1109/HICSS.1999.772684