Anti-serendipity: finding useless documents and similar documents

Author

Cooper, James W. ; Prager, John M.

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear

2000

fDate

4-7 Jan. 2000

Abstract

The problem of finding your way through a relatively unknown collection of digital documents can be daunting. Such collections sometimes have few categories and little hierarchy, or they have so much hierarchy that valuable relations between documents can easily become obscured. We describe here how our work in the area of term-recognition and sentence-based summarization can be used to filter the document lists that we return from searches. We can thus remove or downgrade the ranking of some documents that have limited utility even though they may match many of the search terms fairly accurately. We also describe how we can use this same system to find documents that are closely related to a document of interest, thus continuing our work to provide tools for query-free searching.

Keywords

information retrieval; anti-serendipity; query-free searching; sentence-based summarization; similar documents; term-recognition; useless documents; Computer interfaces; Displays; Feedback; Filters; Performance analysis; Search engines; Statistics; Text analysis; Thesauri

fLanguage

English

Publisher

IEEE

Conference_Title

System Sciences, 2000. Proceedings of the 33rd Annual Hawaii International Conference on

Print_ISBN

0-7695-0493-0

Type

conf

DOI

10.1109/HICSS.2000.926691

Filename

926691

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3145490