Title :
Natural Language Watermarking Based on Syntactic Displacement and Morphological Division
Author :
Kim, Mi-Young ; Zaiane, Osmar R. ; Goebel, Randy
Author_Institution :
Dept. of Comput. Sci., Univ. of Alberta, Edmonton, AB, Canada
Abstract :
This paper explores a method for Korean text watermarking based on a linguistic analysis scheme that uses morphemic and syntactic analysis. In this scheme, a predicate nominal is separated into its nominal and its predicate, and a syntactic adverbial is displaced. Korean, as an agglutinative language, provides a good basis for morpheme-based natural language watermarking because a word consists of several morphemes. A Korean word usually consists of a content morpheme and a function morpheme. A predicate nominal is an exception: it has two content morphemes (a nominal and a predicate) and one function morpheme, so it can be divided into a nominal and a predicate. We also perform syntax-based watermarking, displacing syntactic adverbials on the basis that most languages permit a syntactic adverbial to move within its clause. Combining these morphemic and syntactic characteristics, we propose a method of language watermarking based on syntactic displacement and morphological division. To make the system more secure, we also include a sentence weight value and encode the weight value with a watermark bit. Our watermarking method does not change the meaning of most marked sentences, and it also preserves their naturalness. Experimental results show that the rate of unnatural sentences in the marked text is reasonable, that the watermarking capacity is better than that of previous systems, and that the coverage of marked sentences is also reasonable. They also show that the marked text retains the same style and carries the same information without semantic distortion.
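The syntactic-displacement idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not say how positions map to bits, so the convention below (adverbial at clause start for bit 0, adverbial immediately before the clause-final predicate for bit 1) is an assumption made only for illustration. The names `embed_bit` and `extract_bit` are hypothetical, the clause is assumed to be already parsed into constituents, and morphological division and the sentence weight value are not modeled.

```python
from typing import List


def embed_bit(constituents: List[str], adverbial: str, bit: int) -> List[str]:
    """Return the clause with the adverbial placed according to `bit`.

    `constituents` holds the clause without its adverbial, ending with the
    predicate (Korean is predicate-final). Assumed convention, not taken
    from the paper: bit 0 -> adverbial first; bit 1 -> adverbial directly
    before the predicate.
    """
    if len(constituents) < 2:
        raise ValueError("need at least one argument plus a predicate")
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    if bit == 0:
        return [adverbial] + constituents
    return constituents[:-1] + [adverbial, constituents[-1]]


def extract_bit(marked: List[str], adverbial: str) -> int:
    """Recover the bit from the adverbial's position in a marked clause."""
    pos = marked.index(adverbial)  # assumes the adverbial string is unique
    if pos == 0:
        return 0
    if pos == len(marked) - 2:
        return 1
    raise ValueError("adverbial is not in a position this sketch encodes")


# English glosses stand in for Korean constituents in predicate-final order.
clause = ["the student", "the book", "read"]
for bit in (0, 1):
    marked = embed_bit(clause, "yesterday", bit)
    print(bit, marked, extract_bit(marked, "yesterday"))
```

Because the bit is read from the adverbial's position alone, extraction in this sketch does not need the original text; whether the paper's scheme has that property is not stated in the abstract.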
Keywords :
computational linguistics; natural language processing; text analysis; watermarking; Korean text watermarking; agglutinative language; linguistic analysis scheme; morpheme-based natural language watermarking; morphemic analysis; morphological division; semantic distortion; syntactic adverbial; syntactic analysis; syntactic displacement;
Conference_Title :
Computer Software and Applications Conference Workshops (COMPSACW), 2010 IEEE 34th Annual
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-8089-0
Electronic_ISBN :
978-0-7695-4105-1
DOI :
10.1109/COMPSACW.2010.37