DocumentCode :
234725
Title :
A Two-Stage Approach for Reconstruction of Cross-Cut Shredded Text Documents
Author :
Ya Wang ; Ding-Cheng Ji
Author_Institution :
Dept. of Basic Sci., Taizhou Coll. of Nanjing Univ. of Sci. & Tech., Taizhou, China
fYear :
2014
fDate :
15-16 Nov. 2014
Firstpage :
12
Lastpage :
16
Abstract :
This paper presents a two-stage approach for the reconstruction of cross-cut shredded text documents. Cross-cut shredding mechanically cuts a document into rectangular shreds of (almost) identical shape. After pre-processing the shreds with image-based techniques, we define a cluster quality measure called "matching proportion" (MP) and use it to cluster the shreds that belong to the same row. The shreds in each cluster (row) are then aligned, and the whole document is reconstructed by aligning all rows. All alignments are performed by a memetic algorithm, which extends a genetic algorithm by embedding a probabilistic Kruskal-based heuristic. Experiments are presented for two different instances. The results show that the two-stage approach is an appropriate reconstruction method that provides good solutions in a reasonable amount of time.
Keywords :
document image processing; genetic algorithms; image matching; image reconstruction; text analysis; cross-cut shredded text documents reconstruction; genetic algorithm; image-based techniques; matching proportion; memetic algorithm; probabilistic Kruskal based heuristic; two-stage approach; Clustering algorithms; Genetic algorithms; Heuristic algorithms; Image reconstruction; Indexes; Sociology; Statistics; Kruskal heuristic; cluster analysis; document reconstruction; genetic algorithms; memetic algorithms;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Security (CIS), 2014 Tenth International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4799-7433-7
Type :
conf
DOI :
10.1109/CIS.2014.92
Filename :
7016843