DocumentCode :
2646105
Title :
Handwritten text line extraction based on minimum spanning tree clustering
Author :
Yin, Fei ; Liu, Cheng-Lin
Author_Institution :
Inst. of Autom., Chinese Acad. of Sci., Beijing
Volume :
3
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
1123
Lastpage :
1128
Abstract :
Text line extraction from unconstrained handwritten documents is a challenge because the text lines are often skewed and curved and the space between lines is not obvious. To solve this problem, we propose an approach based on minimum spanning tree (MST) clustering with new distance measures. First, the connected components of the document image are grouped into a tree by MST clustering with a new distance measure. The edges of the tree are then dynamically cut to form text lines by using a new objective function for finding the number of clusters. This approach is totally parameter-free and can be applied to various documents with multi-skewed and curved lines. Experiments on handwritten Chinese documents demonstrate the effectiveness of the approach.
Keywords :
document image processing; feature extraction; handwritten character recognition; pattern clustering; trees (mathematics); handwritten text line extraction; minimum spanning tree clustering; unconstrained handwritten document image; Character recognition; Notice of Violation; Optical character recognition software; Pattern analysis; Pattern recognition; Performance analysis; Pixel; Strips; Text analysis; Wavelet analysis; Connected component labeling; Handwritten text line extraction; MST clustering; Multi-skewed document; OCR
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Wavelet Analysis and Pattern Recognition, 2007. ICWAPR '07. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1065-1
Electronic_ISBN :
978-1-4244-1066-8
Type :
conf
DOI :
10.1109/ICWAPR.2007.4421601
Filename :
4421601