DocumentCode :
610252
Title :
News auto-tagging using Wikipedia
Author :
Shams Eldin, Shaimaa; El-Beltagy, S.R.
Author_Institution :
Center for Inf. Sci., Nile Univ., Giza, Egypt
fYear :
2013
fDate :
17-19 March 2013
Firstpage :
158
Lastpage :
163
Abstract :
This paper presents an efficient method for automatically annotating Arabic news stories with tags using Wikipedia. The system uses Wikipedia article names, properties, and redirects to build a pool of meaningful tags. Efficient matching methods are then used to detect text fragments in input news stories that correspond to entries in the constructed tag pool. Generated tags represent real-life entities or concepts, such as the names of popular places, well-known organizations, and celebrities. A news site can use these tags indirectly for indexing, clustering, classification, and statistics generation, or directly to give a reader an overview of a news story's contents. Evaluation of the system has shown that the tags it generates are better than those generated by MSN Arabic news.
Keywords :
Web sites; indexing; natural language processing; pattern clustering; pattern matching; text analysis; MSN Arabic news; Wikipedia article names; Wikipedia article properties; automatic Arabic news story annotation; input news story text fragment detection; matching methods; news auto-tagging; news site classification; news site clustering; news site indexing; news story contents; real life entities; statistics generation; Dictionaries; Electronic publishing; Encyclopedias; Indexing; Internet; Tagging; Arabic text; Disambiguation; Tagging; Wikipedia;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Innovations in Information Technology (IIT), 2013 9th International Conference on
Conference_Location :
Abu Dhabi
Type :
conf
DOI :
10.1109/Innovations.2013.6544411
Filename :
6544411