Title :
Segmentation and tagging of the Shang Oracle-Bone Inscriptions based on lucene and dictionary
Author :
Jin-yu, Kai ; Tian-lin, Wang ; Yong-ge, Liu ; Hong-chao, Tan
Author_Institution :
School of Computer and Information Engineering, Anyang Normal University, China
Abstract :
Correctly segmenting the Shang Oracle-Bone Inscriptions according to their grammar and dictionary is the prerequisite and basis for building a corpus of the inscriptions and, in turn, for computer-aided textual interpretation of them. This paper adapts modern Chinese word segmentation technology, combining mechanical word segmentation with characteristics scanning. In the experiment, 200 pieces of Shang Oracle-Bone Inscriptions serve as the samples. As judged by experts, the accuracy of the resulting segmentation can exceed 90%. Both the accuracy and the segmentation efficiency are therefore good, and combining mechanical word segmentation with characteristics scanning is an efficient method for segmenting the Shang Oracle-Bone Inscriptions.
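The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a combination of characteristics scanning and mechanical (dictionary-based) segmentation might look. It assumes forward maximum matching as the mechanical step and a set of marker characters as the "characteristics"; the function names, the sample dictionary, the marker set, and the sample text in modern transcription are all hypothetical and are not taken from the paper.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Mechanical segmentation: greedily take the longest dictionary word
    starting at each position; fall back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def segment(text, dictionary, markers, max_len=4):
    """Characteristics scanning (assumed form): cut the text at marker
    characters first, then segment each remaining chunk mechanically."""
    tokens = []
    chunk = ""
    for ch in text:
        if ch in markers:
            if chunk:
                tokens.extend(forward_max_match(chunk, dictionary, max_len))
                chunk = ""
            tokens.append(ch)  # the marker is emitted as its own token
        else:
            chunk += ch
    if chunk:
        tokens.extend(forward_max_match(chunk, dictionary, max_len))
    return tokens


# Hypothetical dictionary and marker set, for illustration only.
sample_dictionary = {"癸卯", "今日", "雨"}
sample_markers = {"卜"}
print(segment("癸卯卜今日雨", sample_dictionary, sample_markers))
```

Cutting at marker characters first keeps a dictionary match from spanning a boundary that the markers already fix, which is one plausible reason to pair the two techniques.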
Keywords :
Computers; Dictionaries; Grammar; Information processing; Laboratories; Tagging; Characteristics scanning; Lucene; mechanical word segmentation; the Oracle-Bone Inscriptions;
Conference_Title :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
DOI :
10.1109/ICISE.2010.5690528