Title :
The A2iA Multi-lingual Text Recognition System at the Second Maurdor Evaluation
Author :
Moysset, Bastien ; Bluche, Theodore ; Knibbe, Maxime ; Benzeghiba, Mohamed Faouzi ; Messina, Ronaldo ; Louradour, Jerome ; Kermorvant, Christopher
Author_Institution :
A2iA, Paris, France
Abstract :
This paper describes the system submitted by A2iA to the second Maurdor evaluation for multi-lingual text recognition. A system based on recurrent neural networks and weighted finite state transducers was used for both printed and handwritten text recognition in French, English and Arabic. To cope with the difficulty of the documents, multiple text line segmentations were considered. An automatic procedure was used to prepare the annotated text lines needed to train the neural network. Language models were used to decode sequences of characters or words for French and English, and also sequences of part-of-Arabic words (PAWs) in the case of Arabic. This system ranked first at the second Maurdor evaluation for both printed and handwritten text recognition in French, English and Arabic.
Keywords :
document image processing; handwriting recognition; image segmentation; learning (artificial intelligence); natural language processing; recurrent neural nets; text detection; A2iA multilingual text recognition system; Arabic; English; French; Maurdor evaluation; PAW; annotated text lines; handwritten recognition; handwritten text recognition; language model; multiple text line segmentation; neural network training; part-of-Arabic words; printed recognition; printed text recognition; recurrent neural network; weighted finite state transducer; Error analysis; Hidden Markov models; Text recognition; Training; Training data; OCR
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
Print_ISBN :
978-1-4799-4335-7
DOI :
10.1109/ICFHR.2014.57