DocumentCode :
168347
Title :
Towards a stratified learning approach to predict future citation counts
Author :
Chakraborty, Tamal ; Kumar, Sudhakar ; Goyal, Puneet ; Ganguly, Niloy ; Mukherjee, Arjun
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, Kharagpur, India
fYear :
2014
fDate :
8-12 Sept. 2014
Firstpage :
351
Lastpage :
360
Abstract :
In this paper, we study the problem of predicting the future citation count of a scientific article a given time interval after its publication. To this end, we gather a dataset of more than 1.5 million scientific papers from the computer science domain and analyze it exhaustively. We observe that the citation counts of articles over the years follow a diverse set of patterns; on closer inspection, we identify six broad categories of citation patterns. This observation motivates us to adopt a stratified learning approach to the prediction task: we propose a two-stage prediction model in which the first stage maps a query paper into one of the six categories, and the second stage runs a regression module only on the subpopulation corresponding to that category to predict the future citation count of the query paper. Experimental results show that categorizing this large dataset during the training phase leads to a remarkable improvement (around 50%) over the well-known baseline system.
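The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the features, the category names, or the learners, so everything here is an assumption. A nearest-centroid classifier stands in for the first stage, a one-variable least-squares line per category stands in for the regression module, and the labels `cat_a` and `cat_b`, the feature vectors, and the `early_counts` input are hypothetical (only two categories are shown instead of six).

```python
from collections import defaultdict
from statistics import mean


def fit_centroids(features, labels):
    """Stage 1 training: one mean feature vector per category."""
    groups = defaultdict(list)
    for x, label in zip(features, labels):
        groups[label].append(x)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict_category(centroids, x):
    """Stage 1 inference: pick the category whose centroid is nearest."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / var if var else 0.0
    return slope, my - slope * mx


def fit_two_stage(features, labels, early_counts, future_counts):
    """Train the classifier, then one regressor per category subpopulation."""
    centroids = fit_centroids(features, labels)
    by_cat = defaultdict(lambda: ([], []))
    for label, early, future in zip(labels, early_counts, future_counts):
        by_cat[label][0].append(early)
        by_cat[label][1].append(future)
    regressors = {label: fit_line(xs, ys) for label, (xs, ys) in by_cat.items()}
    return centroids, regressors


def predict_future(model, x, early_count):
    """Stage 2 inference: use only the regressor of the predicted category."""
    centroids, regressors = model
    slope, intercept = regressors[predict_category(centroids, x)]
    return slope * early_count + intercept


# Hypothetical toy data: two categories with different growth behaviour.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["cat_a", "cat_a", "cat_b", "cat_b"]
early_counts = [2, 4, 2, 4]
future_counts = [4, 8, 7, 13]

model = fit_two_stage(features, labels, early_counts, future_counts)
prediction_a = predict_future(model, [0.95, 0.05], 3)
prediction_b = predict_future(model, [0.0, 1.0], 3)
```

With this toy data the same early count of 3 yields different predictions depending on the assigned category, which is the point of fitting a separate regressor on each subpopulation rather than one regressor on the whole dataset.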
Keywords :
citation analysis; publishing; query processing; regression analysis; baseline system; citation patterns; computer science domain; future citation count prediction; huge dataset; prediction task; publication; query paper; regression module; scientific article; scientific papers; stratified learning approach; two-stage prediction model; Abstracts; Accuracy; Computer science; Predictive models; Productivity; Support vector machines; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on
Conference_Location :
London
Type :
conf
DOI :
10.1109/JCDL.2014.6970190
Filename :
6970190