Title :
Sentence Decomplexification using holistic aspect-based clause detection for long sentence understanding
Author :
Liu, Chao-Hong ; Wu, Chung-Hsien
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Date :
Nov. 29 2010-Dec. 3 2010
Abstract :
Long sentences pose significant challenges for many natural language processing (NLP) tasks, such as machine translation and language understanding, because state-of-the-art parsers still have difficulty analyzing them. In this paper, we identify the Sentence Decomplexification (SD) problem and propose models for SD to help understand long sentences. Given a complex sentence, SD seeks to return two sentences, one for the main clause and the other for the subordinate clause, which together carry all the information in the original sentence. Since identifying subordinate clauses is more difficult than traditional chunking, we also propose a holistic aspect-based detection (HAD) method for clause detection to reduce the overhead of the SD sentence similarity computation. We provide the formalisms of SD and show that HAD can be used to make this task more efficient. The SD system was used to improve the performance of a long sentence understanding system. Experimental results show that SD achieves 78.7% accuracy using the Chinese Gigaword Corpus as the sentence comparison corpus. For long sentence understanding, the proposed method improves accuracy from 70.7% without SD to 75.5% with SD.
Keywords :
language translation; natural language processing; string matching; text analysis; Chinese Gigaword Corpus; SD sentence similarity; clause detection; complex sentence; holistic aspect-based clause detection; holistic aspect-based detection method; long sentence understanding; machine translation; sentence decomplexification; subordinate clause; language understanding
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-6244-5
DOI :
10.1109/ISCSLP.2010.5684897