DocumentCode
668817
Title
Research and application of news-text similarity algorithm based on Chinese word segmentation
Author
Wei Guan ; Pengzhou Zhang
Author_Institution
New Media Inst., Commun. Univ. of China, Beijing, China
fYear
2013
fDate
20-22 Nov. 2013
Firstpage
484
Lastpage
487
Abstract
With the rapid development of the Internet, the volume of text messages on the network is growing exponentially. Given this vast amount of information, quickly and efficiently identifying similar news-texts across different sites plays a major role in strengthening the integrated management of network information. Existing text similarity algorithms have many disadvantages when applied to Chinese news-texts, so we propose a more suitable and effective news-text similarity algorithm. The algorithm uses Chinese word segmentation and, on that basis, applies an improved vector space model to news-text similarity comparison. Experimental results show that the proposed method outperforms traditional methods, supporting the effectiveness of the proposed Chinese news-text similarity calculation method.
Keywords
electronic messaging; text analysis; vectors; word processing; Chinese news-texts; Chinese word segmentation technology; Internet; network information; news-text similarity algorithm; text messages; vector space model; Accuracy; Algorithm design and analysis; Computational modeling; Educational institutions; Internet; Semantics; Vectors; Chinese text similarity; Chinese word segmentation; news-text; vector space model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Consumer Electronics, Communications and Networks (CECNet), 2013 3rd International Conference on
Conference_Location
Xianning
Print_ISBN
978-1-4799-2859-0
Type
conf
DOI
10.1109/CECNet.2013.6703375
Filename
6703375