DocumentCode :
2053700
Title :
On Refining Real-Time Multilingual News Event Extraction through Deployment of Cross-Lingual Information Fusion Techniques
Author :
Piskorski, Jakub ; Belayeva, Jenya ; Atkinson, Martin
Author_Institution :
Inst. of Comput. Sci., Warsaw, Poland
fYear :
2011
fDate :
12-14 Sept. 2011
Firstpage :
38
Lastpage :
45
Abstract :
Nowadays, many influential security-related facts are reported multiple times by different sources and in different languages. Therefore, in recent years, research on advancing event extraction technology has shifted from classical single-document extraction toward cross-document information aggregation and fact validation. However, relatively little work has been reported on cross-lingual information fusion in this area. This paper presents the results of preliminary experiments on deploying cross-lingual information fusion techniques to refine the results of a large-scale multilingual news event extraction system. The first technique fuses the responses of the mono-lingual event extraction systems, whereas the second uses state-of-the-art machine translation to convert all news articles reporting on a given event into one common language and then applies the corresponding mono-lingual event extraction system to the translated articles. An evaluation of these techniques on a news article corpus, whose articles refer to 523 real-world crisis-related events (violent events, man-made and natural disasters), revealed that the descriptions of circa 10% of the events could be refined by fusing the event descriptions returned by the mono-lingual event extraction systems. The overall gain in recall and precision over the best mono-lingual system was 6.4% and 4.8%, respectively. The second approach, based on machine translation, turned out to perform significantly worse than both the former technique and the best mono-lingual system (English).
Keywords :
information resources; language translation; natural language processing; sensor fusion; English; advancing event extraction technology; crisis-related events; cross-document information aggregation; cross-lingual information fusion techniques; fact validation; large-scale multilingual news event extraction system; machine translation; man-made; mono-lingual event extraction systems; natural disasters; real-time multilingual news event extraction; security-related facts; single-document extraction; violent events; Data mining; Grammar; Monitoring; Real time systems; Reliability; Security; Weapons; information fusion; multilinguality; natural language processing; news event extraction; text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligence and Security Informatics Conference (EISIC), 2011 European
Conference_Location :
Athens
Print_ISBN :
978-1-4577-1464-1
Electronic_ISBN :
978-0-7695-4406-9
Type :
conf
DOI :
10.1109/EISIC.2011.72
Filename :
6061189