DocumentCode :
668817
Title :
Research and application of news-text similarity algorithm based on Chinese word segmentation
Author :
Wei Guan ; Pengzhou Zhang
Author_Institution :
New Media Inst., Commun. Univ. of China, Beijing, China
fYear :
2013
fDate :
20-22 Nov. 2013
Firstpage :
484
Lastpage :
487
Abstract :
With the rapid development of the Internet, the volume of text on the network is growing exponentially. Given this volume of information, quickly and efficiently identifying similar news texts published on different sites plays a major role in strengthening the integrated management of network information. Existing text similarity algorithms have many disadvantages when applied to Chinese news texts, so we propose a more suitable and effective news-text similarity algorithm. The algorithm first applies Chinese word segmentation to the news texts and then compares the segmented texts using an improved vector space model. Experimental results show that the proposed method outperforms traditional methods, supporting the effectiveness of the proposed Chinese news-text similarity calculation method.
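The abstract does not specify the segmenter or how the vector space model was improved, so the following is only a minimal sketch of the general pipeline it names: segment the text, build term-frequency vectors, and compare them by cosine similarity. The dictionary `TOY_DICT`, the forward-maximum-matching segmenter `segment`, and the function `cosine_similarity` are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (assumption); a real system would use a full
# dictionary or a dedicated Chinese word segmentation tool.
TOY_DICT = {"中国", "新闻", "文本", "相似度", "算法", "网络", "信息"}
MAX_WORD_LEN = 3  # length of the longest entry in TOY_DICT


def segment(text, dictionary=TOY_DICT, max_len=MAX_WORD_LEN):
    """Forward maximum matching: at each position take the longest
    dictionary word; fall back to a single character if none matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two raw term-frequency vectors
    (plain vector space model, without any weighting improvements)."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = segment("中国新闻文本")
    b = segment("网络新闻文本")
    print(a, b, cosine_similarity(a, b))
```

Raw term frequencies keep the sketch short; weighting schemes such as TF-IDF are a common refinement, but whether the paper uses one is not stated in the abstract.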
Keywords :
electronic messaging; text analysis; vectors; word processing; Chinese news-texts; Chinese word segmentation technology; Internet; network information; news-text similarity algorithm; text messages; vector space model; Accuracy; Algorithm design and analysis; Computational modeling; Educational institutions; Internet; Semantics; Vectors; Chinese text similarity; Chinese word segmentation; news-text; vector space model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Consumer Electronics, Communications and Networks (CECNet), 2013 3rd International Conference on
Conference_Location :
Xianning
Print_ISBN :
978-1-4799-2859-0
Type :
conf
DOI :
10.1109/CECNet.2013.6703375
Filename :
6703375