DocumentCode
3563719
Title
Order estimation of Japanese paragraphs by supervised machine learning
Author
Ito, Satoshi ; Murata, Masaki ; Tokuhisa, Masato ; Qing Ma
Author_Institution
Dept. of Inf. & Electron., Tottori Univ., Tottori, Japan
fYear
2014
Firstpage
1096
Lastpage
1101
Abstract
We propose a method for estimating the order of paragraphs by supervised machine learning, using a support vector machine (SVM) as the learner. Estimating paragraph order is useful for sentence generation and sentence correction. In experiments on ordering the first two paragraphs of an article, the proposed method achieved an accuracy of 0.86, the same accuracy as manual estimation. It also achieved higher accuracy than the baseline methods in experiments using two paragraphs of an article. Feature analysis showed that adnominals, conjunctions, and dates were effective for ordering the first two paragraphs, whereas the ratio of new words and the similarity between the preceding paragraphs and the paragraph being estimated were effective for ordering all pairs of paragraphs. We also compared order estimation for sentences and for paragraphs and clarified the differences between them. For the first two paragraphs, paragraph order estimation would be easier than sentence order estimation because paragraphs contain more information than sentences. For all pairs of paragraphs, paragraph order estimation would be more difficult than sentence order estimation because a story may conclude within a single paragraph.
Keywords
learning (artificial intelligence); natural language processing; support vector machines; text analysis; word processing; Japanese paragraphs order estimation; adnominals; conjunctions; dates; feature analysis; sentences order estimation; supervised machine learning; support vector machine; words ratio; Accuracy; Estimation; Manuals; Speech; Supervised learning; Support vector machines; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Soft Computing and Intelligent Systems (SCIS), 2014 Joint 7th International Conference on and Advanced Intelligent Systems (ISIS), 15th International Symposium on
Type
conf
DOI
10.1109/SCIS-ISIS.2014.7044697
Filename
7044697