DocumentCode
2292279
Title
Stalker, A Multilingual Text Mining Search Engine for Open Source Intelligence
Author
Neri, F. ; Pettoni, Ten Col M.
Author_Institution
Synthema, Pisa
fYear
2008
fDate
9-11 July 2008
Firstpage
314
Lastpage
320
Abstract
The revolution in information technology is making open sources more accessible, ubiquitous, and valuable. International intelligence communities have seen open sources become increasingly easy and cheap to acquire in recent years. However, up to 80% of electronic data is textual, and the most valuable information is often hidden and encoded in pages that are neither structured nor classified. The process of accessing all these raw data, heterogeneous in terms of source and language, and transforming them into information is therefore strongly linked to automatic textual analysis and synthesis, which in turn depend heavily on the ability to master the problems of multilinguality. This paper describes a content enabling system that provides deep semantic search and information access to large quantities of distributed multimedia data for both experts and the general public. STALKER provides language-independent search and dynamic classification features for a broad range of data collected from several sources in a number of culturally diverse languages.
Keywords
information retrieval; search engines; text analysis; STALKER; automatic textual analysis; information access; international intelligence communities; multilingual text mining search engine; open source intelligence; semantic search; Databases; Information analysis; Information retrieval; Information security; Information systems; Information technology; Internet; Natural languages; Search engines; Text mining; focused crawling; functional analysis; morphological analysis; natural language processing; open source intelligence; supervised clustering; syntactic analysis; unsupervised clustering;
fLanguage
English
Publisher
ieee
Conference_Titel
Information Visualisation, 2008. IV '08. 12th International Conference
Conference_Location
London
ISSN
1550-6037
Print_ISBN
978-0-7695-3268-4
Type
conf
DOI
10.1109/IV.2008.9
Filename
4577965