Language Feature Mining for Document Subjectivity Analysis

Author

Chen, Bo ; He, Hui ; Guo, Jun

fYear

2007

fDate

1-3 Nov. 2007

Firstpage

62

Lastpage

67

Abstract

In recent years, document sentiment analysis has attracted a great deal of research interest. One important aspect of this field is subjectivity analysis. This problem differs from traditional text categorization in that more linguistic or semantic information is required to better estimate the subjectivity of a document. Therefore, in this paper, we focus on how to extract useful and meaningful language features and how to combine these features efficiently. Under the well-known n-gram language model framework, we investigated a series of language-grams with different orders n and various distances to find the most important ones. In addition, we tried several weighting methods to make the features more meaningful. Based on these various kinds of language features, we adopted a tailored Maximum Entropy modeling method to construct our subjectivity classifier. Detailed experiments given in this paper show that well-extracted language features are suitable for the document subjectivity analysis task.

Keywords

Classification tree analysis; Data mining; Entropy; Internet; Machine learning; Machine learning algorithms; Military computing; Motion pictures; Text analysis; Text categorization;

fLanguage

English

Publisher

IEEE

Conference_Titel

Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on

Conference_Location

Chengdu

Print_ISBN

978-0-7695-3016-1

Type

conf

DOI

10.1109/ISDPE.2007.105

Filename

4402640