• DocumentCode
    659641
  • Title
    Risk adjustment of patient expenditures: A big data analytics approach
  • Author
    Lin Li ; Bagheri, Saeed ; Goote, Helena ; Hasan, Aftab ; Hazard, Gregg
  • Author_Institution
    Philips Res. North America, Briarcliff Manor, NY, USA
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    12
  • Lastpage
    14
  • Abstract
    For healthcare applications, voluminous patient data contain rich and meaningful insights that can be revealed using advanced machine learning algorithms. However, the volume and velocity of such high-dimensional data require a new big data analytics framework, because traditional machine learning tools cannot be applied directly. In this paper, we introduce our proof-of-concept big data analytics framework for developing a risk adjustment model of patient expenditures, which uses a “divide and conquer” strategy to exploit the big-yet-rich data and improve model accuracy. We leverage a distributed computing platform, e.g., MapReduce, to implement advanced machine learning algorithms on our data set. Specifically, the random forest regression algorithm, which is suitable for high-dimensional healthcare data, is applied to improve the accuracy of our predictive model. Our proof-of-concept framework demonstrates the effectiveness of predictive analytics using the random forest algorithm as well as the efficiency of the distributed computing platform.
  • Keywords
    Big Data; data analysis; distributed processing; divide and conquer methods; health care; learning (artificial intelligence); medical information systems; regression analysis; MapReduce; big data analytics approach; big-yet-rich data; distributed computing platform; divide and conquer strategy; healthcare applications; high dimensional data; high dimensional healthcare data; machine learning algorithms; machine learning tools; model accuracy; patient data; patient expenditures; predictive analytics; predictive model; proof-of-concept big data analytics framework; proof-of-concept framework; random forest regression algorithm; risk adjustment; Computational modeling; Data handling; Data models; Data storage systems; Information management; Linear regression; Predictive models; Distributed Computing; Healthcare Big Data; Patient Expenditure; Random Forest; Risk Adjustment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type
    conf
  • DOI
    10.1109/BigData.2013.6691790
  • Filename
    6691790