Title of article :
Blended metrics for novel sentence mining
Author/Authors :
Tang, Wenyin and Tsai, Flora S. and Chen, Lihui
Issue Information :
Journal with serial issue number, year 2010
Pages :
6
From page :
5172
To page :
5177
Abstract :
With the abundance of raw text documents available on the internet, many articles contain redundant information. Novel sentence mining can discover novel, yet relevant, sentences given a specific topic defined by a user. In real-time novelty mining, an important issue is how to select a suitable novelty metric that quantitatively measures the novelty of a particular sentence. To utilize the merits of different metrics, a blended metric is proposed that combines the cosine similarity and new word count metrics. The blended metric has been tested on TREC 2003 and TREC 2004 Novelty Track data. The experimental results show that the blended metric generally performs better across topics with different ratios of novelty, which is useful for real-time novelty mining in topics with varying degrees of novelty.
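The idea of blending the two metrics can be illustrated with a minimal sketch. The abstract does not give the blending formula, so everything below is an assumption made for illustration: the weighted linear combination, the weight `alpha`, the normalization of the new word count by sentence length, and all function names (`tokenize`, `cosine_novelty`, `new_word_novelty`, `blended_novelty`). It should not be read as the authors' method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification; no stemming or stop words)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two token lists using raw term frequencies."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_novelty(sentence, history):
    """1 minus the highest similarity to any earlier sentence (1.0 if no history)."""
    if not history:
        return 1.0
    return 1.0 - max(cosine_similarity(sentence, h) for h in history)


def new_word_novelty(sentence, history):
    """Fraction of distinct words in the sentence not seen in any earlier sentence.

    Assumption: the raw new word count is divided by the number of distinct
    words so that the score lies in [0, 1] like the cosine-based score.
    """
    words = set(sentence)
    if not words:
        return 0.0
    seen = set().union(*history)
    return len(words - seen) / len(words)


def blended_novelty(sentence, history, alpha=0.5):
    """Hypothetical blend: weighted average of the two novelty scores."""
    return (alpha * cosine_novelty(sentence, history)
            + (1.0 - alpha) * new_word_novelty(sentence, history))


if __name__ == "__main__":
    history = [tokenize("the cat sat on the mat")]
    for text in ["the cat sat on the mat", "the cat barked", "dogs bark loudly"]:
        score = blended_novelty(tokenize(text), history)
        print(f"{text!r}: {score:.3f}")
```

In a real-time setting, a sentence would typically be declared novel when its score exceeds a chosen threshold and then appended to `history`; both the threshold and `alpha` would need tuning on held-out topics.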
Keywords :
Novelty detection, Cosine similarity, Blended metric, Novel sentence mining, Text mining, New word count
Journal title :
Expert Systems with Applications
Serial Year :
2010
Record number :
2348100