Title :
Integrating Full-Text Search and Linguistic Analyses on Disperse Data for Humanities and Social Sciences Research Projects
Author :
Villegas, Marta ; Parra, Carla
Author_Institution :
Inst. Univ. de Linguistica Aplic., Univ. Pompeu Fabra, Barcelona, Spain
Abstract :
The research reported in this paper is part of the activities carried out within CLARIN (Common Language Resources and Technology Infrastructure), a large-scale pan-European project to create, coordinate, and make language resources and technologies (LRT) available and readily usable. CLARIN is devoted to creating a persistent and stable infrastructure that serves the needs of the European humanities and social sciences (HSS) research community, so that HSS researchers can efficiently access distributed resources and apply the analysis and exploitation tools relevant to their research. We present a real use case addressed as a CLARIN scenario, together with the implementation of a demonstrator that helps us foresee potential problems and contributes to planning the implementation phase. The use case concerns how to support researchers interested in harvesting and analyzing data from historical press archives; it therefore addresses the integration and interoperability of distributed, heterogeneous research data and analysis tools.
Keywords :
data analysis; linguistics; open systems; query formulation; social sciences; CLARIN; common language resources and technology infrastructure; disperse data; full-text search; humanities; interoperability; linguistic analyses; Computer aided software engineering; Data analysis; Large scale integration; Large-scale systems; Light rail systems; Natural language processing; Proposals; Service oriented architecture; Text analysis; Wheels; Humanities & Social Sciences; Linguistic analysis tools; integration and interoperability; textual data harvesting
Conference_Title :
Fifth IEEE International Conference on e-Science (e-Science '09), 2009
Conference_Location :
Oxford
Print_ISBN :
978-0-7695-3877-8
DOI :
10.1109/e-Science.2009.12