DocumentCode
3306718
Title
Research on short text classification for web forum
Author
Xiaochun He ; Conghui Zhu ; Tiejun Zhao
Author_Institution
MOE-MS Key Lab. of Natural Language Process. & Speech, Harbin Inst. of Technol., Harbin, China
Volume
2
fYear
2011
fDate
26-28 July 2011
Firstpage
1052
Lastpage
1056
Abstract
The unique characteristics of short text make short text classification quite different from traditional long text processing. The feature space of short text is so sparse that extracting sufficient and effective features is notoriously difficult. In this paper, aiming to classify short texts on web forums accurately, we introduce a novel short-text-processing method based on semantic extension, which enriches the content of the original short text and effectively solves the feature sparsity problem. In addition, we put forward the concept of Key-Pattern (KP) and propose a new text feature representation approach based on KP, which extracts phrases carrying strong semantic information as the text features. Traditional classifier models are then applied to classify the texts. Experimental results show that the proposed method effectively improves the accuracy and recall of short text classification.
Keywords
Internet; feature extraction; pattern classification; text analysis; Web forum; classifier model; feature extraction; feature sparse problem; key-pattern concept; long text processing; semantic extension; short text classification; short-text-processing method; text feature representation approach; Classification algorithms; Feature extraction; Internet; Noise measurement; Semantics; Text categorization; Key-Pattern; Semantic extension; Short text classification; Text representation; Web forum;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019652
Filename
6019652