• DocumentCode
    668817
  • Title
    Research and application of news-text similarity algorithm based on Chinese word segmentation
  • Author
    Wei Guan ; Pengzhou Zhang
  • Author_Institution
    New Media Inst., Commun. Univ. of China, Beijing, China
  • fYear
    2013
  • fDate
    20-22 Nov. 2013
  • Firstpage
    484
  • Lastpage
    487
  • Abstract
    With the rapid development of the Internet, the volume of text messages on the network is growing exponentially. Faced with this vast amount of online information, quickly and efficiently identifying similar news-texts across different sites plays a major role in strengthening the integrated management of network information. Because existing text similarity algorithms have many disadvantages when applied to Chinese news-texts, we propose a more suitable and effective news-text similarity algorithm. The algorithm uses Chinese word segmentation technology and, on that basis, applies an improved vector space model to news-text similarity comparison. Experimental results show that the proposed method outperforms traditional methods, supporting the effectiveness of the proposed Chinese news-text similarity calculation method.
  • Keywords
    electronic messaging; text analysis; vectors; word processing; Chinese news-texts; Chinese word segmentation technology; Internet; network information; news-text similarity algorithm; text messages; vector space model; Accuracy; Algorithm design and analysis; Computational modeling; Educational institutions; Internet; Semantics; Vectors; Chinese text similarity; Chinese word segmentation; news-text; vector space model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Consumer Electronics, Communications and Networks (CECNet), 2013 3rd International Conference on
  • Conference_Location
    Xianning
  • Print_ISBN
    978-1-4799-2859-0
  • Type
    conf
  • DOI
    10.1109/CECNet.2013.6703375
  • Filename
    6703375