A language model for parsing very long Chinese sentences

Author

Chen, Hsin-Hsi

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan

fYear

1993

fDate

8-11 Nov 1993

Firstpage

290

Lastpage

297

Abstract

Corpus analysis shows that about 75% of Chinese sentences are composed of more than two sentence segments separated by commas or semicolons. A segment may be a sentence, a noun phrase, a verb phrase, an adjective phrase, an adverbial phrase, or a prepositional phrase. An NP segment may serve as the subject of the next segment or the object of the previous segment. The empty category pro may also appear in a VP segment. The maximal freedom in the use of pros, the large number of segments, the variety of segment types, and the associativity problem make sentence parsing difficult, and few parsing systems deal with these problems. The authors regard a segment as the basic parsing unit. Their model also uses characteristic words, verb subcategories, topic chains, and some heuristic rules to link the segments into meaningful units. Pro resolution and segment linking are useful for practical applications.
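The segmentation step the abstract mentions (segments separated by commas or semicolons) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of separator characters (both full-width and ASCII forms), and the sample sentence are assumptions made only for illustration, and the sketch does no segment typing, pro resolution, or segment linking.

```python
import re

# Assumption: segments are delimited by full-width or ASCII commas and semicolons.
SEPARATORS = re.compile(r"[,;，；]")


def split_segments(sentence: str) -> list[str]:
    """Split a sentence into non-empty segments at commas or semicolons."""
    parts = SEPARATORS.split(sentence)
    return [part.strip() for part in parts if part.strip()]


def has_more_than_two_segments(sentence: str) -> bool:
    """Check the property the abstract reports for about 75% of sentences."""
    return len(split_segments(sentence)) > 2


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the paper's corpus.
    example = "他来了，我走了；她留下。"
    print(split_segments(example))
    print(has_more_than_two_segments(example))
```

A parser built along the lines of the abstract would then classify each segment (sentence, NP, VP, and so on) and link the segments; those steps depend on resources the record does not describe, so they are left out here.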

Keywords

computational linguistics; natural languages; NP segment; VP segment; associativity; language model; long Chinese sentences; sentence parsing; sentence segments; Computer science; Couplings; Information analysis; Joining processes; Natural language processing; Natural languages; Particle separators

fLanguage

English

Publisher

IEEE

Conference_Titel

Tools with Artificial Intelligence, 1993. TAI '93. Proceedings., Fifth International Conference on

Conference_Location

Boston, MA

ISSN

1063-6730

Print_ISBN

0-8186-4200-9

Type

conf

DOI

10.1109/TAI.1993.633970

Filename

633970