Title :
Representations and metrics for off-line handwriting segmentation
Author :
Breuel, Thomas M.
Author_Institution :
PARC, Palo Alto, CA, USA
Abstract :
Segmentation is a key step in many off-line handwriting recognition systems, but to date there are almost no ground-truth segmentation databases and no widely accepted, formally defined metrics for segmentation performance. This paper proposes a representation of segmentations and presegmentations in terms of color images. Such representations allow ground-truth and hypothesized segmentations to be exchanged conveniently in standard image formats. The paper formally defines the notions of oversegmentation and undersegmentation in terms of the maximal bipartite match between corresponding pixels. It also defines a number of metrics that quantify the frequency and extent of handwriting events such as kerning, splitting, and merging of characters. The author hopes that these metrics and representations will find wider use in the community and serve as a basis for creating standard training and test databases of segmentation data.
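The idea of comparing a ground-truth segmentation with a hypothesized one through a bipartite match can be illustrated with a minimal Python sketch. It is not the paper's formal definition: it assumes each image has already been decoded into rows of integer segment labels, that label 0 marks background, and that two segments "correspond" when they share at least `min_overlap` pixels. The names `rgb_to_label`, `overlap_counts`, `maximum_matching`, `segmentation_errors`, and `min_overlap` are all hypothetical, and the matching here is a plain maximum-cardinality match found with augmenting paths.

```python
from collections import Counter

BACKGROUND = 0  # assumption: label 0 marks background pixels


def rgb_to_label(r, g, b):
    """Pack one RGB pixel into a single integer segment label (assumed encoding)."""
    return (r << 16) | (g << 8) | b


def overlap_counts(truth, hyp):
    """Count pixels shared by each (truth label, hypothesis label) pair."""
    counts = Counter()
    for truth_row, hyp_row in zip(truth, hyp):
        for t, h in zip(truth_row, hyp_row):
            if t != BACKGROUND and h != BACKGROUND:
                counts[(t, h)] += 1
    return counts


def maximum_matching(edges):
    """Maximum-cardinality bipartite matching via augmenting paths."""
    neighbors = {}
    for t, h in edges:
        neighbors.setdefault(t, []).append(h)
    match_of_hyp = {}  # hypothesis label -> truth label currently matched to it

    def try_assign(t, visited):
        for h in neighbors[t]:
            if h in visited:
                continue
            visited.add(h)
            # Take h if it is free, or if its current partner can move elsewhere.
            if h not in match_of_hyp or try_assign(match_of_hyp[h], visited):
                match_of_hyp[h] = t
                return True
        return False

    for t in neighbors:
        try_assign(t, set())
    return {t: h for h, t in match_of_hyp.items()}


def segmentation_errors(truth, hyp, min_overlap=1):
    """Return (matching, oversegmented truth labels, undersegmented hyp labels)."""
    counts = overlap_counts(truth, hyp)
    edges = [pair for pair, n in counts.items() if n >= min_overlap]
    matching = maximum_matching(edges)
    truth_degree = Counter(t for t, _ in edges)
    hyp_degree = Counter(h for _, h in edges)
    # A truth segment covered by several hypothesis segments was split.
    over = sorted(t for t, d in truth_degree.items() if d > 1)
    # A hypothesis segment covering several truth segments merged them.
    under = sorted(h for h, d in hyp_degree.items() if d > 1)
    return matching, over, under
```

For example, if truth character 1 is covered by hypothesis segments 5 and 6, the sketch reports label 1 as oversegmented; if a single hypothesis segment spans truth characters 1 and 2, it reports that hypothesis label as undersegmented. A real implementation would first read the color images and convert pixels with something like `rgb_to_label`.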
Keywords :
handwritten character recognition; image representation; image segmentation; character kerning; character merging; character splitting; color images; ground truth segmentation databases; maximal bipartite match; off-line handwriting segmentation metrics; off-line handwriting segmentation representations; oversegmentation; presegmentations; standard image formats; undersegmentation; Color; Frequency; Graphics; Handwriting recognition; Image databases; Image segmentation; Merging; Pixel; System performance; Testing
Conference_Title :
Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, 2002
Print_ISBN :
0-7695-1692-0
DOI :
10.1109/IWFHR.2002.1030948