DocumentCode :
185797
Title :
Internet information source discovery based on multi-seeds cocitation
Author :
Gao Hui ; Niu Haibo ; Luo Wei
Author_Institution :
China Defense Sci. & Technol. Inf. Center, Beijing, China
fYear :
2014
fDate :
18-19 Oct. 2014
Firstpage :
368
Lastpage :
371
Abstract :
Discovering Internet information sources on a specific topic is the groundwork of information acquisition in the current big data era. This paper presents a multi-seeds cocitation algorithm for finding new Internet information sources. The proposed algorithm is based on cocitation but differs from traditional algorithms in that it takes multiple websites on a specific topic as input seeds. We then introduce the Combined Cocitation Degree (CCD) to measure the relevancy of newly found websites: the higher a website's CCD, the more topic-related it is. Finally, the set of websites with the highest CCD is taken as the new Internet information sources on the specific topic. Experiments show that the proposed method outperforms traditional algorithms in the scenarios we tested.
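The abstract does not give the formula for the CCD, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a candidate site earns one point for each seed site cited on the same page, summed over all citing pages. The function names `combined_cocitation` and `top_sources`, the scoring rule, and the example host names are all hypothetical.

```python
from collections import Counter


def combined_cocitation(citing_pages, seeds):
    """Score non-seed sites by how often they are cited alongside seed sites.

    citing_pages: iterable of sets, each holding the sites one page links to.
    seeds: set of known on-topic sites.
    Returns a dict mapping each candidate site to an ASSUMED combined score.
    """
    seeds = set(seeds)
    scores = Counter()
    for links in citing_pages:
        links = set(links)
        shared_seeds = links & seeds
        if not shared_seeds:
            continue  # page cites no seed, so it contributes no cocitation
        for site in links - seeds:
            # Assumed weighting: one point per seed cited on the same page.
            scores[site] += len(shared_seeds)
    return dict(scores)


def top_sources(citing_pages, seeds, k=3):
    """Return the k candidates with the highest assumed combined score."""
    scores = combined_cocitation(citing_pages, seeds)
    # Sort by score descending, then by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical link data: each set is the outgoing links of one citing page.
pages = [
    {"seed-a.example", "seed-b.example", "new-1.example"},
    {"seed-a.example", "new-1.example", "new-2.example"},
    {"seed-b.example", "new-2.example"},
    {"unrelated.example", "new-3.example"},
]
seeds = {"seed-a.example", "seed-b.example"}

ranking = top_sources(pages, seeds)
print(ranking)
```

Under this assumed scoring, a site cited together with several seeds ranks above one cited with a single seed, which is the intuition the abstract describes. A site that never shares a citing page with any seed receives no score at all.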
Keywords :
Big Data; Internet; Web sites; citation analysis; data mining; CCD; Internet information source discovery; combined cocitation degree; information acquisition; multiseeds cocitation; relevancy measurement; Algorithm design and analysis; Charge coupled devices; Google; Noise; Web pages; cocitation; information source; related website
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Security, Pattern Analysis, and Cybernetics (SPAC), 2014 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4799-5352-3
Type :
conf
DOI :
10.1109/SPAC.2014.6982717
Filename :
6982717