DocumentCode
3253109
Title
Information retrieval using Hellinger distance and sqrt-cos similarity
Author
Shunzhi Zhu ; Lizhao Liu ; Yan Wang
Author_Institution
Dept. of Comput. Sci. & Technol., Xiamen Univ. of Technol., Xiamen, China
fYear
2012
fDate
14-17 July 2012
Firstpage
925
Lastpage
929
Abstract
In this paper, we propose a similarity measurement method based on the Hellinger distance and the square-root cosine. We use the Hellinger distance as the distance metric for document clustering and a new square-root cosine similarity for query information retrieval. This new similarity/distance also bridges traditional tf-idf weighting and binary weighting in the vector space model. Finally, we compare the performance of this method with that of a method based on Euclidean distance and cosine similarity. The results show that precision and recall are improved by using the sqrt-cos similarity.
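The abstract does not give formulas, so the following is only a minimal sketch under stated assumptions. It assumes term-weight vectors are non-negative, uses the standard Hellinger distance on L1-normalised vectors, and takes the square-root cosine to be the ordinary cosine applied to element-wise square-rooted vectors. The function names `hellinger_distance` and `sqrt_cos_similarity` and the sample vectors are hypothetical and not taken from the paper.

```python
import math


def sqrt_cos_similarity(x, y):
    """Cosine of the element-wise square roots of x and y (assumed definition).

    For non-negative vectors this equals
    sum(sqrt(x_i * y_i)) / (sqrt(sum(x)) * sqrt(sum(y))).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sx, sy = sum(x), sum(y)
    if sx == 0 or sy == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return sum(math.sqrt(a * b) for a, b in zip(x, y)) / math.sqrt(sx * sy)


def hellinger_distance(p, q):
    """Standard Hellinger distance after L1-normalising p and q."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    sp, sq = sum(p), sum(q)
    if sp == 0 or sq == 0:
        raise ValueError("cannot normalise an all-zero vector")
    squared = sum(
        (math.sqrt(a / sp) - math.sqrt(b / sq)) ** 2 for a, b in zip(p, q)
    )
    return math.sqrt(squared) / math.sqrt(2)


# Hypothetical tf-idf-like weights for two documents over four terms.
doc_a = [3.0, 0.0, 1.0, 2.0]
doc_b = [1.0, 1.0, 0.0, 4.0]

similarity = sqrt_cos_similarity(doc_a, doc_b)
distance = hellinger_distance(doc_a, doc_b)
print(similarity, distance)
```

Under these assumptions the two quantities are linked by distance² = 1 − similarity, so ranking by larger sqrt-cos similarity is the same as ranking by smaller Hellinger distance. Taking square roots also dampens large term weights, which is one plausible reading of how the measure sits between tf-idf and binary weighting.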
Keywords
document handling; query processing; vectors; Euclidean distance; binary weighting; distance metric; document clustering; information Hellinger distance; information retrieval; precision improvement; query information retrieval; recall improvement; square-root cosine similarity measurement method; tf-idf weighting; vector space model; Computational modeling; Educational institutions; Probabilistic logic; Hellinger cosine measurement; Hellinger distance
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location
Melbourne, VIC
Print_ISBN
978-1-4673-0241-8
Type
conf
DOI
10.1109/ICCSE.2012.6295217
Filename
6295217