Title :
Using social network knowledge for detecting spider constructions in social security fraud
Author :
Van Vlasselaer, Veronique ; Meskens, Jan ; Van Dromme, Dries ; Baesens, Bart
Author_Institution :
Dept. of Decision Sci. & Inf. Manage., Katholieke Univ. Leuven, Leuven, Belgium
Abstract :
Social networks offer a vast amount of additional information to enrich standard learning algorithms, but the most challenging part is extracting relevant information from networked data. Fraudulent behavior is imperceptibly concealed in both local and relational data, making it even harder to define useful input for prediction models. Starting from expert knowledge, this paper efficiently incorporates social network effects to detect fraud for the Belgian governmental social security institution and to improve the performance of traditional non-relational fraud prediction tasks. Because there are many types of social security fraud, this paper concentrates on payment fraud, predicting which companies intentionally disobey their payment duties to the government. We introduce a new fraudulent structure, the so-called spider construction, which can easily be expressed in social network terms and included in the learning algorithms. By focusing on the egonet of each company, the proposed method can handle large-scale networks. To address the skewed class distribution, the SMOTE approach is applied to rebalance the data. The models were trained on different timestamps and evaluated on varying time windows. Using techniques such as Random Forest, logistic regression, and Naive Bayes, this paper shows that the combined relational model improves the AUC score and the precision of the predictions compared to the base scenario where only local variables are used.
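The egonet idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes an undirected network stored as an adjacency dictionary and a set of companies already labeled as fraudulent. The node names, the toy graph, and the `fraud_neighbor_share` feature are hypothetical and chosen only for illustration; the abstract does not specify which relational features the authors actually derive.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, Set[str]]  # node -> set of directly connected nodes


def egonet(graph: Graph, ego: str) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return the nodes and undirected edges of the 1-hop egonet around `ego`."""
    nodes = {ego} | graph.get(ego, set())
    edges = set()
    for u in nodes:
        for v in graph.get(u, set()):
            # Keep only edges inside the egonet; u < v stores each edge once.
            if v in nodes and u < v:
                edges.add((u, v))
    return nodes, edges


def fraud_neighbor_share(graph: Graph, ego: str, known_fraud: Set[str]) -> float:
    """Hypothetical relational feature: share of direct neighbors labeled fraudulent."""
    neighbors = graph.get(ego, set())
    if not neighbors:
        return 0.0
    return len(neighbors & known_fraud) / len(neighbors)


# Toy network (assumed data, not from the paper).
graph: Graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
known_fraud = {"B"}

nodes, edges = egonet(graph, "A")
share = fraud_neighbor_share(graph, "A", known_fraud)
print(sorted(nodes), sorted(edges), share)
```

Because each feature depends only on a node's immediate neighborhood, the cost per company is bounded by the size of its egonet rather than the size of the whole network, which is one plausible reading of why an egonet-based approach scales to large networks.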
Keywords :
Bayes methods; financial data processing; fraud; learning (artificial intelligence); regression analysis; security of data; social networking (online); AUC score; Belgian governmental social security institution; SMOTE approach; egonet; fraud detection; fraudulent behavior; fraudulent structure; information extraction; local data; local variables; logistic regression; naive Bayes; nonrelational fraud prediction tasks; payment fraud; random forest; relational data; skewed class distribution; social network knowledge; social security fraud; spider construction detection; standard learning algorithms; Companies; Face; Government; Indexes; Logistics; Manuals; Suspensions;
Conference_Title :
Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on
Conference_Location :
Niagara Falls, ON