Title :
Machine learning for the automatic identification of terrorist incidents in worldwide news media
Author :
Mason, Richard ; McInnis, Brian ; Dalal, Siddhartha
Author_Institution :
RAND Corp., Santa Monica, CA, USA
Abstract :
The RAND Database of Worldwide Terrorism Incidents (RDWTI) seeks to index information about all terrorist incidents that occur and are mentioned in worldwide news media, providing a useful resource for policy researchers and decision makers. We examined automated classification methods that could be used to identify news articles about terrorist incidents, thus enabling analysts to read a smaller number of news articles and maintain the database with less effort and cost. The support vector machine (SVM) and Lasso methods were only modestly successful, but a classifier based on the gradient boosting method (GBM) appeared to be very successful, correctly ranking 80% of the relevant articles at the “top of the pile” for examination by a human analyst.
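The "top of the pile" figure in the abstract can be read as a recall measure: of all relevant articles, the share that a classifier's ranking places in the portion an analyst actually reads. The sketch below illustrates that reading only. The function name `recall_at_top`, the `fraction` parameter, and the scores and labels are hypothetical and are not taken from the paper, whose exact evaluation protocol the abstract does not state.

```python
def recall_at_top(scores, labels, fraction):
    """Share of relevant items that land in the top `fraction` of the ranked pile."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_relevant = sum(labels)
    if total_relevant == 0:
        return 0.0
    # Rank article indices by classifier score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Number of articles the analyst reads (at least one).
    cutoff = max(1, round(len(ranked) * fraction))
    found = sum(labels[i] for i in ranked[:cutoff])
    return found / total_relevant


# Hypothetical classifier scores and labels (1 = article describes a
# terrorist incident); illustrative values, not data from the paper.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05, 0.83, 0.22, 0.59, 0.10]
labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

print(recall_at_top(scores, labels, 0.5))  # 0.8
```

In this toy example, reading the top half of the ranked pile recovers four of the five relevant articles. Any classifier that outputs a score per article, including the SVM, Lasso, and GBM models named in the abstract, could be compared this way.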
Keywords :
database management systems; gradient methods; learning (artificial intelligence); multimedia computing; support vector machines; terrorism; GBM; Lasso methods; RAND database; RDWTI; SVM; automated classification methods; automatic identification; decision makers; gradient boosting method; index information; machine learning; support vector machine; terrorist incidents; top of the pile; worldwide news media; worldwide terrorism incidents; Databases; Humans; Standards; Training; Training data; ElasticNet; Gradient Boosting; Lasso; news articles
Conference_Title :
Intelligence and Security Informatics (ISI), 2012 IEEE International Conference on
Conference_Location :
Arlington, VA
Print_ISBN :
978-1-4673-2105-1
DOI :
10.1109/ISI.2012.6284096