• DocumentCode
    311134
  • Title

    A language model based on semantically clustered words in a Chinese character recognition system

  • Author

    Lee, Hsi-Jian ; Tung, Cheng-Huang

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
  • Volume
    1
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    450
  • Abstract
    This paper presents a new method for clustering the words of a dictionary into word groups, which are used by the language model of a Chinese character recognition system to describe contextual information. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to train the weights of the semantic attributes of the character-based word classes. These weights are then updated according to the words of the behavior dictionary, which has a rather complete word set. Next, the updated word classes are clustered into m groups by a greedy method according to the semantic measurement. Finally, the words in the behavior dictionary are assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m^2. In the experiments, the recognition system with the proposed model performed better than one using a character-based bigram language model.
  • Keywords
    character recognition; computational linguistics; Chinese character recognition; Chinese synonym dictionary; Tong2yi4ci2 ci2lin2; behavior dictionary; character recognition system; language model; semantic attributes; semantically clustered words; Character recognition; Computer science; Context modeling; Dictionaries; Error correction; Natural languages; Postal services; Random access memory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type

    conf

  • DOI
    10.1109/ICDAR.1995.599033
  • Filename
    599033
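
The abstract states that, once words are assigned to m groups, the bigram contextual information needs m^2 parameters. The following is a minimal sketch of that idea only, not the authors' implementation: the toy vocabulary, the hand-written group assignment, the add-one smoothing, and the exhaustive candidate search are all assumptions made for illustration, and the function names `train_group_bigram` and `best_sequence` are hypothetical. The paper's semantic-attribute weighting and greedy clustering steps are not reproduced here.

```python
from collections import Counter
from itertools import product


def train_group_bigram(sentences, word_to_group, m):
    """Estimate add-one smoothed P(g2 | g1) over m word groups.

    The returned table has exactly m * m entries, one per group pair.
    """
    counts = Counter()
    for words in sentences:
        groups = [word_to_group[w] for w in words]
        counts.update(zip(groups, groups[1:]))

    table = {}
    for g1 in range(m):
        # Add-one smoothing (an assumption): every row gets m extra counts.
        row_total = sum(counts[(g1, g2)] for g2 in range(m)) + m
        for g2 in range(m):
            table[(g1, g2)] = (counts[(g1, g2)] + 1) / row_total
    return table


def best_sequence(candidates, word_to_group, table):
    """Pick the candidate word sequence with the highest group-bigram score.

    `candidates` holds one list of candidate words per position, as a
    recognizer might propose. Exhaustive search is used for brevity.
    """
    def score(seq):
        p = 1.0
        for a, b in zip(seq, seq[1:]):
            p *= table[(word_to_group[a], word_to_group[b])]
        return p

    return max(product(*candidates), key=score)


# Hypothetical toy data: 5 words assigned by hand to m = 3 groups.
word_to_group = {"cat": 0, "dog": 0, "eats": 1, "runs": 1, "fish": 2}
m = 3
sentences = [["cat", "eats", "fish"], ["dog", "eats", "fish"], ["dog", "runs"]]

table = train_group_bigram(sentences, word_to_group, m)
candidates = [["cat"], ["eats", "fish"], ["fish", "runs"]]
best = best_sequence(candidates, word_to_group, table)
```

With m = 3 the table holds 9 entries regardless of vocabulary size, which is the point of grouping: a word-based or character-based bigram table would instead grow with the square of the vocabulary.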