Title :
Detection of Incoherences in a Document Corpus Based on the Application of a Neuro-Fuzzy System
Author :
Martin, Sebastien ; Arribas, V. ; Sainz, Gregorio I.
Author_Institution :
Fundacion CARTIF, Parque Tecnol. de Boecillo, Valladolid, Spain
Abstract :
The aim of this paper is to detect incoherences in the concepts, ideas, values and other content of technical document corpora. The way in which document collections are generated, modified or updated introduces errors that undermine the coherence of the information, leading to legal, economic and social problems. A solution based on summarization, matching and neuro-fuzzy systems is proposed to deal with this problem. To this end, every document (from the electrical domain) is summarized by its relevant information in the form of 4-tuples of terms, describing the most relevant ideas and concepts that must be free of incoherences. These representations are then matched using two well-known measures (Levenshtein distance and cosine similarity). The final decision about whether an incoherence really exists, and how relevant it is, is obtained by training the neuro-fuzzy system FasArt in a supervised classification process, based on prior knowledge of the activity area and of domain experts. Moreover, this fuzzy approach makes it possible to extract the learnt and expert knowledge from the neuro-fuzzy system as a set of fuzzy rules that can support a decision-making system for this complex and non-objective problem.
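The matching step named in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each 4-tuple is a sequence of four term strings and that two tuples are compared slot by slot with a normalized Levenshtein similarity plus one bag-of-words cosine similarity over the whole tuple. The function names, the normalization and the example tuples are hypothetical and are not taken from the paper; the FasArt classifier that would consume these features is not shown.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def match_features(t1, t2):
    """Hypothetical feature vector for a pair of 4-tuples:
    four per-slot Levenshtein similarities in [0, 1], then one cosine score."""
    feats = []
    for x, y in zip(t1, t2):
        longest = max(len(x), len(y))
        feats.append(1.0 - levenshtein(x, y) / longest if longest else 1.0)
    feats.append(cosine_similarity(" ".join(t1), " ".join(t2)))
    return feats


# Invented example tuples from an imagined electrical-domain corpus.
tuple_a = ("cable", "nominal voltage", "maximum", "400 V")
tuple_b = ("cable", "nominal voltage", "maximum", "230 V")
print(match_features(tuple_a, tuple_b))
```

In this sketch, a pair whose first three slots agree while the last slot differs would be a candidate incoherence; in the approach described above, a decision of that kind is left to the trained neuro-fuzzy classifier rather than to a fixed threshold.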
Keywords :
document handling; expert systems; fuzzy set theory; learning (artificial intelligence); pattern classification; decision taking system; domain expert system; economic problem; fuzzy set theory; information coherency; neuro-fuzzy system; social problem; supervised classification process; technical document corpora; Documentation; Fuzzy neural networks; Fuzzy sets; Fuzzy systems; Law; Legal factors; Proposals; Quality management; Systems engineering and theory; Text analysis; 4-tuples; Content incoherences; matching techniques; neuro-fuzzy system; summarization; supervised classification;
Conference_Titel :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2009.101