• DocumentCode
    282174
  • Title

    The use of letter patterns for script recognition

  • Author

    Wells, C.J. ; Evett, L.J. ; Whitby, P.E. ; Whitrow, R.J.

  • Author_Institution
    Dept. of Comput., Trent Polytech., Nottingham, UK
  • fYear
    1989
  • fDate
    32783
  • Firstpage
    42522
  • Lastpage
    42524
  • Abstract
    Addresses the problem of script recognition with ambiguous input from a pattern recognizer. A pattern recognizer produces a number of letter candidates for each letter position of the word it processes. These letter candidates combine to form a number (usually large, often very large) of letter string candidates for each input word of script that is written. The paper considers methods of using orthographic information (letter patterns) to reduce this uncertainty. Letter string candidates may be rejected if they contain letter sequences which are not allowable in English (using n-grams), or if they are not real English words. The major problem with using n-grams in this way is that the candidate strings remaining after look-up are not necessarily words. Better reduction is given by comparing the candidate strings with a list of words, which can be obtained from a machine-readable dictionary. The remaining allowable candidates can then be ranked according to their probability of being correct, as given by the recognizer. Systems employing lexical look-up in the past have found it difficult to hold a reasonably large vocabulary in memory while keeping it searchable in real time. Alternative data structures for representing large lists of words are discussed in the paper.
  • Keywords
    character recognition; data structures; letter patterns; letter string candidates; lexical look-up; machine-readable dictionary; n-grams; orthographic information; pattern recognizer; script recognition
  • fLanguage
    English
  • Publisher
    iet
  • Conference_Titel
    Character Recognition and Applications, IEE Colloquium on
  • Conference_Location
    London
  • Type

    conf

  • Filename
    198765
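
The filtering steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the letter lattice, the per-letter probabilities, the bigram set, and the three-word lexicon are all hypothetical values chosen for illustration, and the string score is assumed to be the product of the per-letter probabilities. The sketch shows why n-gram filtering alone can leave non-words in the candidate list, while dictionary look-up removes them before ranking.

```python
from itertools import product
from math import prod


def candidate_strings(lattice):
    """Yield (string, score) for every combination of per-position letters.

    The score is the product of the per-letter probabilities (an assumption;
    the abstract does not say how the recognizer scores whole strings).
    """
    for combo in product(*lattice):
        word = "".join(letter for letter, _ in combo)
        score = prod(p for _, p in combo)
        yield word, score


def allowed_by_ngrams(word, ngrams, n=2):
    """True if every length-n letter sequence in `word` is in `ngrams`."""
    return all(word[i:i + n] in ngrams for i in range(len(word) - n + 1))


def rank(candidates):
    """Order candidates from most to least probable."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Hypothetical recognizer output: letter candidates for a 3-letter word.
lattice = [
    [("c", 0.6), ("e", 0.4)],
    [("a", 0.7), ("o", 0.3)],
    [("t", 0.5), ("l", 0.5)],
]

# Hypothetical orthographic knowledge.
allowable_bigrams = {"ca", "at", "co", "ot", "ea", "eo"}
lexicon = {"cat", "cot", "eat"}

all_candidates = list(candidate_strings(lattice))

# Filter 1: n-gram look-up. Survivors need not be real words.
ngram_survivors = [c for c in all_candidates
                   if allowed_by_ngrams(c[0], allowable_bigrams)]

# Filter 2: dictionary look-up, then ranking by recognizer score.
ranked = rank(c for c in all_candidates if c[0] in lexicon)

if __name__ == "__main__":
    print("all candidates: ", [w for w, _ in all_candidates])
    print("n-gram survivors:", [w for w, _ in ngram_survivors])
    print("ranked words:    ", [w for w, _ in ranked])
```

With these illustrative inputs, the string "eot" passes the bigram check because "eo" and "ot" are both allowable sequences, yet it is not in the lexicon, so only dictionary look-up rejects it. A plain Python set stands in for the word list here; the paper's own data structures for large vocabularies are not described in the abstract and are not reproduced.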