• DocumentCode
    260373
  • Title
    Handling imbalanced data in customer churn prediction using combined sampling and weighted random forest
  • Author
    Effendy, Veronikha ; Adiwijaya ; Baizal, Z.K.A.
  • Author_Institution
    Telkom Univ., Bandung, Indonesia
  • fYear
    2014
  • fDate
    28-30 May 2014
  • Firstpage
    325
  • Lastpage
    330
  • Abstract
    Customer churn is a major problem in the telecommunications industry because it affects the company's revenue. When churn occurs, the share of records that describe churning customers is usually low. Unfortunately, these churn records are precisely the ones that must be predicted early. The scarcity of churn records leads to the problem of imbalanced data, which makes it difficult to develop a good prediction model. This research applied a combination of sampling techniques and Weighted Random Forest (WRF) to improve the customer churn prediction model on a sample dataset from the telecommunications industry in Indonesia. WRF is claimed to produce prediction models that perform well on imbalanced data. However, this research found that the performance of the prediction model built by WRF on this dataset was still quite low. Sampling techniques were applied to overcome this problem, using a combination of simple undersampling and SMOTE. The results showed that combined sampling with WRF produced a prediction model that performed better than WRF alone.
  • Keywords
    customer relationship management; sampling methods; telecommunication industry; Indonesia; SMOTE; WRF; churn data; combined-sampling; company revenue; customer churn prediction model; imbalanced data; sample dataset; sampling techniques; telecommunications industry; weighted random forest; Companies; Data models; Market research; Predictive models; Telecommunications; Vegetation; Churn; Combined-sampling; Prediction; SMOTE; Weighted Random Forest; simple under sampling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technology (ICoICT), 2014 2nd International Conference on
  • Conference_Location
    Bandung
  • Type
    conf
  • DOI
    10.1109/ICoICT.2014.6914086
  • Filename
    6914086
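
The combined sampling named in the abstract (simple undersampling of the majority class followed by SMOTE on the minority class) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the function names, the `majority_keep` target, the neighbour count `k`, and the inverse-frequency class weights are all assumptions made for illustration, and the abstract does not state how the paper's WRF weights were chosen.

```python
import random
from collections import Counter


def random_undersample(rows, labels, majority, keep, rng):
    """Keep `keep` randomly chosen majority rows plus every non-majority row."""
    maj_idx = [i for i, y in enumerate(labels) if y == majority]
    kept = set(rng.sample(maj_idx, keep))
    pairs = [(x, y) for i, (x, y) in enumerate(zip(rows, labels))
             if y != majority or i in kept]
    return [x for x, _ in pairs], [y for _, y in pairs]


def smote(minority_rows, n_new, k, rng):
    """Create n_new synthetic rows, each interpolated between a minority row
    and one of its k nearest minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def combined_sample(rows, labels, minority, majority, majority_keep, k=5, seed=0):
    """Undersample the majority class to `majority_keep` rows, then add
    SMOTE rows until the minority class has the same count (assumed target)."""
    rng = random.Random(seed)
    rows, labels = random_undersample(rows, labels, majority, majority_keep, rng)
    minority_rows = [x for x, y in zip(rows, labels) if y == minority]
    n_new = max(majority_keep - len(minority_rows), 0)
    synthetic = smote(minority_rows, n_new, k, rng)
    return rows + synthetic, labels + [minority] * len(synthetic)


def balanced_class_weights(labels):
    """Inverse-frequency weights, n / (n_classes * count); an assumed
    heuristic, not necessarily the weighting used in the paper."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * cnt) for c, cnt in counts.items()}
```

One possible use: call `combined_sample` on the training split only, then pass the dictionary from `balanced_class_weights` to a random forest that accepts per-class weights, for example the `class_weight` parameter of scikit-learn's `RandomForestClassifier`.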