DocumentCode
168347
Title
Towards a stratified learning approach to predict future citation counts
Author
Chakraborty, Tamal ; Kumar, Sudhakar ; Goyal, Puneet ; Ganguly, Niloy ; Mukherjee, Arjun
Author_Institution
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, Kharagpur, India
fYear
2014
fDate
8-12 Sept. 2014
Firstpage
351
Lastpage
360
Abstract
In this paper, we study the problem of predicting the future citation count of a scientific article a given time interval after its publication. To this end, we gather and exhaustively analyze a dataset of more than 1.5 million scientific papers from the computer science domain. We observe that the citation counts of articles over the years follow a diverse set of patterns; on closer inspection, we identify six broad categories of citation patterns. This observation motivates us to adopt a stratified learning approach to the prediction task, for which we propose a two-stage prediction model: in the first stage, the model maps a query paper to one of the six categories; in the second stage, a regression module is run only on the subpopulation corresponding to that category to predict the future citation count of the query paper. Experimental results show that categorizing this large dataset during the training phase yields a remarkable improvement (around 50%) over the well-known baseline system.
Keywords
citation analysis; publishing; query processing; regression analysis; baseline system; citation patterns; computer science domain; future citation count prediction; huge dataset; prediction task; publication; query paper; regression module; scientific article; scientific papers; stratified learning approach; two-stage prediction model; Abstracts; Accuracy; Computer science; Predictive models; Productivity; Support vector machines; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on
Conference_Location
London
Type
conf
DOI
10.1109/JCDL.2014.6970190
Filename
6970190