DocumentCode :
652051
Title :
Objective Measurement of the Relationship between Variants in Classical Literature
Author :
Tachioka, Yuuki
Author_Institution :
Coll. of Humanities & Sci., Nihon Univ., Chiyoda-ku, Tokyo, Japan
fYear :
2013
fDate :
16-18 Sept. 2013
Firstpage :
202
Lastpage :
203
Abstract :
Stylometrics is a method of analyzing the style of a text using metric features. To apply it to classical literature, the diversity among variants of the same work must be sufficiently smaller than the diversity between different works, because a single work often exists in many variants that have been changed from the original form. This paper confirms that prerequisite and introduces three measures: the Chinese character ratio (which is affected by the original form), the Levenshtein distance, and perplexity. The experimental results show that the Chinese character ratio is effective for discriminating series of variants, and that the Levenshtein distance and perplexity are effective alongside principal component analysis of features, which is commonly used in stylometrics. In particular, perplexity allows the diversity between variants to be compared quantitatively across different works.
Keywords :
computational linguistics; natural language processing; principal component analysis; text analysis; Chinese character ratio; Levenshtein distance; Levenshtein perplexity; classical literature; metric feature; objective measurement; stylometrics; Databases; Educational institutions; Electronic mail; Measurement; Tagging; perplexity; variants of text
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Culture and Computing (Culture Computing), 2013 International Conference on
Conference_Location :
Kyoto
Type :
conf
DOI :
10.1109/CultureComputing.2013.65
Filename :
6680379