Title :
A Comparative Study on Feature Window Selection in Text Filtering
Author :
Quan, Hu ; Fang, Xie ; Xiaoguang, Liu
Author_Institution :
Coll. of Phys. Sci. & Technol., Huazhong Normal Univ., Wuhan, China
Abstract :
Text representation is a preliminary step in text filtering, and the vector space model (VSM) is the most commonly used method in this field. However, the document feature set produced by VSM usually has very high dimensionality, and as a result the distribution of feature values tends to be highly skewed. This paper presents some new mechanisms to mitigate these problems. With these mechanisms, document features are extracted from smaller feature windows, such as sentences, graphs and blocks, rather than from the full text, and the correlative texts are finally evaluated by local similarity. The feature windows are obtained by analyzing the document's linguistic structure. As a result, the approach can markedly improve the precision of text filtering.
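The abstract does not give the matching algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: the feature windows are sentences found with a naive punctuation rule, features are raw term counts, similarity is cosine similarity, and the local similarity of a document is taken as the best score over its windows. All function names here (`tokenize`, `sentence_windows`, `cosine`, `local_similarity`, `global_similarity`) are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_windows(document):
    """Split a document into sentence-level windows (naive rule, an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def local_similarity(profile, document):
    """Best cosine score between the profile and any single window.

    Taking the maximum over windows is an assumption made for this sketch.
    """
    profile_vec = Counter(tokenize(profile))
    scores = [
        cosine(profile_vec, Counter(tokenize(window)))
        for window in sentence_windows(document)
    ]
    return max(scores, default=0.0)


def global_similarity(profile, document):
    """Baseline: cosine score against one vector built from the full text."""
    return cosine(Counter(tokenize(profile)), Counter(tokenize(document)))


if __name__ == "__main__":
    doc = (
        "Stock markets fell sharply today. "
        "The weather was sunny. "
        "Children played in the park."
    )
    profile = "stock markets fell"
    print("local :", local_similarity(profile, doc))
    print("global:", global_similarity(profile, doc))
```

In this toy case the full-text vector is diluted by terms from unrelated sentences, so the window-level score is higher than the full-text score. Whether that translates into better filtering precision depends on the window type and the data.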
Keywords :
information filtering; text analysis; correlative texts; document feature set; document linguistics structures; feature window selection; text filtering; text representation; Application software; Collaboration; Computer science; Educational institutions; Feature extraction; Frequency; Information filters; Information technology; Matched filters; feature vector; feature window; matching algorithm
Conference_Title :
Information Technology and Applications, 2009. IFITA '09. International Forum on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3600-2
DOI :
10.1109/IFITA.2009.189