Title of article :
A metric to search for relevant words
Author/Authors :
Hongding Zhou, Gary W. Slater
Issue Information :
Journal with serial issue number, year 2003
Abstract :
We propose a new metric to evaluate and rank the relevance of words in a text. The method uses the density fluctuations of a word to compute an index that measures its degree of clustering. Highly significant words tend to form clusters, while common words are essentially uniformly spread in a text. If a word is not rare, the metric is stable when we move any individual occurrence of this word in the text. Furthermore, we prove that the metric always increases when words are moved to form larger clusters, or when several independent documents are merged. Using the Holy Bible as an example, we show that our approach reduces the significance of common words when compared to a recently proposed statistical metric.
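The abstract does not state the formula, so the following is only a minimal sketch of the general idea that clustered words have uneven spacing while common words are spread evenly. It is not the metric proposed in the article: the function name `clustering_index`, the use of the coefficient of variation of the gaps between successive occurrences, and the three-occurrence cutoff for rare words are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def clustering_index(tokens, word):
    """Illustrative clustering score for `word` in a token list.

    Assumption: score = (std. dev. of gaps) / (mean gap) between
    successive occurrences. Evenly spaced words score 0; words that
    bunch together score higher. This is a stand-in, not the
    article's metric.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    if len(positions) < 3:
        # Too few occurrences to estimate spacing fluctuations.
        return None
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return pstdev(gaps) / mean(gaps)


# A word spread uniformly through the text: every gap is identical.
uniform_text = ["the", "x"] * 10

# A word that appears in two tight clusters separated by a long gap.
clustered_text = ["god"] * 3 + ["x"] * 47 + ["god"] * 3

uniform_score = clustering_index(uniform_text, "the")
clustered_score = clustering_index(clustered_text, "god")
rare_score = clustering_index(clustered_text, "absent")
```

In this toy input the uniformly spread word gets a score of zero and the clustered word a clearly larger one, which is the qualitative behavior the abstract describes. The stability and monotonicity properties claimed in the abstract belong to the authors' metric and are not guaranteed by this stand-in.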
Journal title :
Physica A: Statistical Mechanics and its Applications