DocumentCode
724196
Title
Keyword extraction for web news documents based on LM-BP neural network
Author
Xiaohui Liu ; Xin Yan ; Zhengtao Yu ; Guangshun Qin ; Yuanyuan Mo
Author_Institution
Sch. of Inf. Eng. & Autom., Kunming Univ. of Sci. & Technol., Kunming, China
fYear
2015
fDate
23-25 May 2015
Firstpage
2525
Lastpage
2531
Abstract
Motivated by practical demand, this paper proposes an approach to keyword extraction for web news documents that applies an improved Levenberg-Marquardt (LM) algorithm to a backpropagation (BP) artificial neural network. First, web news documents that share a consistent HTML format are preprocessed; preprocessing includes noise filtering, web content extraction, word segmentation, part-of-speech (POS) tagging, and stop-word removal, among other steps. Next, effective features such as term frequency (TF) and word location are selected based on the characteristics of news documents, and these features are used to construct and train the BP neural network. Finally, keywords are extracted using the LM algorithm, which adjusts its parameters during training and addresses the long training times and local-minimum problems of standard BP, thereby improving network convergence speed and keyword classification performance. The results show that, for keyword extraction, the LM algorithm achieves better extraction performance and convergence than standard BP.
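The training step described in the abstract can be illustrated with a minimal sketch of a textbook Levenberg-Marquardt update on a small feed-forward network. This is not the authors' implementation: the network size, the finite-difference Jacobian, the damping schedule, the synthetic two-column feature matrix (standing in for TF and word location), and all function names are assumptions made for illustration only.

```python
import numpy as np

def unpack(w, n_in, n_hid):
    """Split the flat parameter vector into layer weights and biases."""
    i = 0
    W1 = w[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_hid]; i += n_hid
    return W1, b1, W2, w[i]

def predict(w, X, n_hid):
    """One tanh hidden layer, sigmoid output = keyword probability."""
    W1, b1, W2, b2 = unpack(w, X.shape[1], n_hid)
    logits = np.clip(np.tanh(X @ W1 + b1) @ W2 + b2, -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(-logits))

def residuals(w, X, y, n_hid):
    return predict(w, X, n_hid) - y

def jacobian(w, X, y, n_hid, eps=1e-6):
    """Forward-difference Jacobian of the residuals (simple, not fast)."""
    r0 = residuals(w, X, y, n_hid)
    J = np.empty((r0.size, w.size))
    for k in range(w.size):
        step = np.zeros_like(w)
        step[k] = eps
        J[:, k] = (residuals(w + step, X, y, n_hid) - r0) / eps
    return J

def train_lm(X, y, n_hid=3, mu=1e-2, iters=50, seed=0):
    """Levenberg-Marquardt: solve (J^T J + mu I) delta = -J^T r each step."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=X.shape[1] * n_hid + 2 * n_hid + 1)
    history = []
    for _ in range(iters):
        r = residuals(w, X, y, n_hid)
        sse = float(r @ r)
        history.append(sse)
        J = jacobian(w, X, y, n_hid)
        while mu < 1e10:
            delta = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -J.T @ r)
            r_new = residuals(w + delta, X, y, n_hid)
            if float(r_new @ r_new) < sse:
                w = w + delta              # accept step, relax damping
                mu = max(mu / 10.0, 1e-12)
                break
            mu *= 10.0                     # reject step, increase damping
        else:
            break                          # no improving step was found
    r = residuals(w, X, y, n_hid)
    history.append(float(r @ r))
    return w, history

# Synthetic stand-in data (assumption): column 0 ~ normalized TF,
# column 1 ~ a location score; the labels are made up for the demo.
data_rng = np.random.default_rng(1)
X = data_rng.random((40, 2))
y = (0.6 * X[:, 0] + 0.4 * X[:, 1] > 0.5).astype(float)

w, history = train_lm(X, y)
print(f"SSE: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because a step is accepted only when it lowers the sum of squared errors, the recorded error never increases; the damping term `mu` moves the update between a gradient-descent-like step (large `mu`) and a Gauss-Newton step (small `mu`).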
Keywords
Internet; backpropagation; feature extraction; feature selection; neural nets; statistical analysis; text analysis; word processing; HTML format; LM-BP neural network; Levenberg-Marquardt algorithm; POS tagging; Web content extraction; Web news document; feature selection; keyword extraction; machine learning; noise filter; statistics-based method; stop words removal; word segmentation; Approximation algorithms; Biological neural networks; Classification algorithms; Convergence; Feature extraction; Training; BP neural network; Keyword Extraction; Levenberg-Marquardt Algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
Control and Decision Conference (CCDC), 2015 27th Chinese
Conference_Location
Qingdao
Print_ISBN
978-1-4799-7016-2
Type
conf
DOI
10.1109/CCDC.2015.7162346
Filename
7162346