DocumentCode
2332817
Title
Using complex linguistic features in context-sensitive text classification techniques
Author
Wong, Alex K S ; Lee, John W T ; Yeung, Daniel S.
Author_Institution
Dept. of Comput., Hong Kong Polytech. Univ., China
Volume
5
fYear
2005
fDate
18-21 Aug. 2005
Firstpage
3183
Abstract
Text classification (TC) is the task of automatically classifying documents based on learned document features. Many popular TC models use the simple occurrence of words in a document as features, and their design commonly assumes that word occurrences are statistically independent. Although this assumption obviously does not hold in general, these TC models have proven robust and efficient. Some recent studies have shown that context-sensitive TC approaches, which take context into account in the form of word co-occurrences, generally perform better. On the other hand, many studies have examined the use of complex linguistic or semantic features, instead of simple word occurrences, as features for information retrieval and classification tasks. While these complex features may intuitively be more relevant to the tasks concerned, results on their effectiveness have been mixed and inconclusive. In this paper we present our investigation into the use of some complex linguistic features with a context-sensitive TC method. Our experimental results show some potential advantages of such an approach.
Keywords
classification; computational linguistics; context-sensitive languages; text analysis; automatic document classification; complex linguistic feature; context-sensitive text classification; learned document feature; semantic feature; word occurrence; Cybernetics; Electronic mail; Feature extraction; Information retrieval; Machine learning; Machine learning algorithms; Robustness; Text categorization; Text processing; Text classification; complex linguistics feature; context-sensitive; semantics
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location
Guangzhou, China
Print_ISBN
0-7803-9091-1
Type
conf
DOI
10.1109/ICMLC.2005.1527491
Filename
1527491