Title :
Clustering and Similarity Analysis of File Names in EDonkey Systems
Author :
Zhang, Min ; Chen, Changjia ; Wei Yan
Author_Institution :
Sch. of Electron. & Inf. Eng., Beijing Jiaotong Univ., Beijing
Abstract :
eDonkey has become the most reliable and popular P2P file-sharing software, with the largest user base. However, finding exactly the resources you need, and finding them quickly, remains a problem. Traditional keyword-based search cannot meet users' needs. Semantic search is an effective and promising alternative, and semantics-based study has become a hot spot in the field of natural language processing. After collecting the names of files shared in eDonkey systems, we focus on analyzing the words in those file names. We adopt two methods to characterize these words: clustering analysis and similarity analysis. Using these methods, we draw several conclusions; in particular, the words are successfully clustered into 5 groups. A semantic web constructed from the clusters and similarities can be used in search engines for P2P systems.
Keywords :
information retrieval; peer-to-peer computing; search engines; semantic Web; P2P file sharing software; P2P systems; clustering analysis; eDonkey systems; natural language processing field; search engines; semantic Web; similarity analysis; traditional searching method; Application software; File servers; Information analysis; Information technology; Peer to peer computing; Reliability engineering; Search engines; Search methods; Semantic Web; Vocabulary; clustering; edonkey; p2p; similarity
Conference_Title :
Intelligent Information Technology Application Workshops, 2008. IITAW '08. International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3505-0
DOI :
10.1109/IITA.Workshops.2008.26