Title of article :
Adapting Searchy to extract data using evolved wrappers
Author/Authors :
Barrero, David F. and R-Moreno, María D. and Camacho, David
Issue Information :
Journal with serial issue number, year 2012
Abstract :
Organizations need diverse information systems to deal with increasing requirements in information storage and processing. This leads to the creation of information islands and therefore to an intrinsic difficulty in obtaining a global view. Providing a unified view of the (likely heterogeneous) information available in an organization is a goal that adds value to its information systems and has been the subject of intense research. In this paper we present an extension of a solution named Searchy, an agent-based mediator system specialized in data extraction and integration. Through the use of a set of wrappers, it integrates information from arbitrary sources and semantically translates it according to a mediated scheme. Searchy is in fact a domain-independent wrapper container that eases wrapper development by providing, for example, semantic mapping. The extension of Searchy proposed in this paper introduces an evolutionary wrapper that is able to evolve wrappers using regular expressions. To achieve this, a Genetic Algorithm (GA) is used to learn a regex that extracts a set of positive samples while rejecting a set of negative samples.
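The learning objective described above (accept the positive samples, reject the negative ones) can be illustrated with a small scoring function. This is only a sketch under assumptions: the abstract does not state the actual fitness function or chromosome encoding, so the `fitness` function, the sample strings, and the candidate patterns below are hypothetical, and choosing the best pattern from a fixed pool stands in for the selection step of a real GA.

```python
import re

def fitness(pattern, positives, negatives):
    """Fraction of samples handled correctly by a candidate regex.

    Hypothetical scoring: a positive sample counts when the regex matches it
    in full, a negative sample counts when the regex does not match it.
    """
    total = len(positives) + len(negatives)
    if total == 0:
        return 0.0
    try:
        compiled = re.compile(pattern)
    except re.error:
        return 0.0  # malformed candidates get the lowest score
    hits = sum(1 for s in positives if compiled.fullmatch(s))
    rejections = sum(1 for s in negatives if not compiled.fullmatch(s))
    return (hits + rejections) / total

# Illustrative samples (not taken from the paper): ISO-style dates are positive.
positives = ["2012-01-15", "1999-12-31"]
negatives = ["2012/01/15", "hello", "12-01-2012"]

# A fixed pool of candidates; a GA would generate these by mutation and crossover.
candidates = [r"\d+", r"\d{4}-\d{2}-\d{2}", r".*", r"(\d{4}"]
best = max(candidates, key=lambda p: fitness(p, positives, negatives))
print(best, fitness(best, positives, negatives))
```

In a full evolutionary loop, a score of this kind would drive selection across generations. Candidates that fail to compile are scored at zero rather than discarded, so they simply lose out during selection.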
Keywords :
Wrappers , Genetic algorithms , Information extraction
Journal title :
Expert Systems with Applications