Title :
A language model for parsing very long Chinese sentences
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan
Abstract :
Corpus analysis shows that about 75% of Chinese sentences are composed of more than two sentence segments separated by commas or semicolons. A segment may be a sentence, a noun phrase, a verb phrase, an adjective phrase, an adverbial phrase, or a prepositional phrase. An NP segment may serve as the subject of the next segment or the object of the previous segment. The empty category pro may also appear in a VP segment. The maximal freedom in the use of pros, the large number of segments, the variety of segment types, and the associativity problem make sentence parsing difficult, and few parsing systems deal with these problems. The authors regard a segment as the basic parsing unit and use characteristic words, subcategories of verbs, topic chains, and some heuristic rules to link the segments into meaningful units. Pro resolution and segment linking are useful for practical applications.
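The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_segments` and `has_multiple_segments`, the delimiter set (full-width and ASCII commas and semicolons), and the example sentence are all assumptions made for illustration. Segment typing, pro resolution, and segment linking are not attempted here.

```python
import re
from typing import List

# Assumption: segments are delimited by full-width or ASCII commas and semicolons.
SEGMENT_DELIMITERS = re.compile(r"[，,；;]")

# Assumption: these sentence-final marks are dropped before splitting.
SENTENCE_FINAL = "。！？.!?"


def split_segments(sentence: str) -> List[str]:
    """Split a sentence into segments at commas and semicolons."""
    body = sentence.strip().rstrip(SENTENCE_FINAL)
    # Discard empty pieces produced by adjacent or trailing delimiters.
    return [seg.strip() for seg in SEGMENT_DELIMITERS.split(body) if seg.strip()]


def has_multiple_segments(sentence: str) -> bool:
    """Return True when the sentence contains more than one segment."""
    return len(split_segments(sentence)) > 1


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the paper's corpus.
    for segment in split_segments("他买了一本书，读了两天；觉得很有趣。"):
        print(segment)
```

Each returned segment would then be the unit handed to a parser; deciding whether a given segment is an NP, VP, or full sentence, and how it attaches to its neighbours, is the part the paper addresses and this sketch leaves out.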
Keywords :
computational linguistics; natural languages; NP segment; VP segment; associativity; language model; long Chinese sentences; sentence parsing; sentence segments; Computer science; Couplings; Information analysis; Joining processes; Natural language processing; Natural languages; Particle separators
Conference_Title :
Tools with Artificial Intelligence, 1993. TAI '93. Proceedings., Fifth International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
0-8186-4200-9
DOI :
10.1109/TAI.1993.633970