Title :
Question identification on Turkish tweets
Author :
Ozger, Z.B. ; Diri, B. ; Girgin, Canan
Author_Institution :
Comput. Eng. Dept., Yildiz Tech. Univ., İstanbul, Turkey
Abstract :
Question identification is a task in Natural Language Processing and Information Extraction. The aim of this work is to detect Turkish tweets that contain question expressions. The application has three stages: pre-processing the data set to remove unnecessary data such as retweets, selecting candidate tweets with a rule-based method, and extracting the tweets that actually contain questions using Conditional Random Fields. For this purpose, one million tweets were collected and labeled. Tweets are an ungrammatical type of data. According to the results, the developed model was largely successful on tweets. Additionally, this is a first study on identifying questions in Turkish tweets.
Keywords :
data analysis; natural language processing; social networking (online); Turkish tweets; candidate tweets; conditional random fields; data set; information extraction; question expressions; question identification; rule-based method; tweet extraction; ungrammatical data type; Accuracy; Feature extraction; Grammar; Media; Reactive power; Training; Twitter
Conference_Title :
Innovations in Intelligent Systems and Applications (INISTA) Proceedings, 2014 IEEE International Symposium on
Conference_Location :
Alberobello
Print_ISBN :
978-1-4799-3019-7
DOI :
10.1109/INISTA.2014.6873608