• DocumentCode
    3200241
  • Title

    Comparative study of an HIV risk scorecard and regression models to rank effects of demographic characteristics on risk of acquiring an HIV infection

  • Author

    Sibanda, Wilbert ; Pretorius, Philip

  • Author_Institution
    Fac. of Health Sci., North-West Univ., Potchefstroom, South Africa
  • fYear
    2013
  • fDate
    18-21 Dec. 2013
  • Firstpage
    57
  • Lastpage
    64
  • Abstract
    This paper covers the development of an HIV risk scorecard using SAS Enterprise Miner™. The scorecard was developed using the 2007 South African annual antenatal HIV and syphilis seroprevalence data. Limited comparisons are made with a more recent 2010 antenatal database. The antenatal data contain demographic characteristics for each pregnant woman: her age, her male sexual partner's age, population group, level of education, gravidity, parity, and HIV and syphilis status. The purpose of this research was to use a scorecard to rank the effects of these demographic characteristics on a pregnant woman's risk of acquiring an HIV infection. The project encompassed selection of the data sample, classing, selection of demographic characteristics, fitting of a regression model, generation of weights-of-evidence (WOE), calculation of information values (IVs), and creation and validation of an HIV risk scorecard. The educational level and syphilis status of the pregnant women produced information values below 0.05 and were excluded from the final HIV risk scorecard. Based on their respective information values, the following four demographic characteristics were found to be of medium predictive strength and were therefore included in the final HIV risk scorecard: pregnant woman's age, age of male sexual partner, gravidity, and parity. The age of the pregnant woman had the highest information value and Gini coefficient. The final objective of this research was to demonstrate that a binned-variable HIV risk scorecard can provide as much risk ranking as any other regression-based model.
  • Keywords
    bioinformatics; demography; diseases; regression analysis; Gini coefficient; HIV infection; HIV risk scorecard; SAS Enterprise Miner; antenatal database; data sample selection; demographic characteristics; information values calculation; regression models; weights-of-evidence; Biological system modeling; Educational institutions; Human immunodeficiency virus; Logistics; Pregnancy; Synthetic aperture sonar; HIV; IV; Scorecard; WOE;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
  • Conference_Location
    Shanghai
  • Type

    conf

  • DOI
    10.1109/BIBM.2013.6732736
  • Filename
    6732736
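
The weights-of-evidence and information-value steps named in the abstract can be sketched as follows. This is a minimal illustration using the standard scorecard definitions, not the authors' SAS Enterprise Miner workflow. The function names (`woe_iv`, `keep_characteristic`), the choice of HIV-negative as the numerator of the WOE ratio, and the counts in `toy_age_bins` are all assumptions made for illustration; the counts are invented and are not taken from the antenatal data. Only the 0.05 rejection cutoff comes from the abstract.

```python
import math

# Cutoff stated in the abstract: characteristics with an information value
# below 0.05 were excluded from the final scorecard.
IV_REJECT_BELOW = 0.05


def woe_iv(bins):
    """Compute per-bin WOE and the total IV for one binned characteristic.

    `bins` is a list of (label, n_negative, n_positive) tuples. Every count
    must be non-zero in this sketch, since WOE takes a log of a ratio.
    Sign convention (an assumption): WOE = ln(share of negatives / share of
    positives), so a positive WOE marks a lower-risk bin.
    """
    total_neg = sum(neg for _, neg, _ in bins)
    total_pos = sum(pos for _, _, pos in bins)
    woe_by_bin = []
    iv = 0.0
    for label, neg, pos in bins:
        dist_neg = neg / total_neg  # share of all negatives in this bin
        dist_pos = pos / total_pos  # share of all positives in this bin
        woe = math.log(dist_neg / dist_pos)
        iv += (dist_neg - dist_pos) * woe
        woe_by_bin.append((label, woe))
    return woe_by_bin, iv


def keep_characteristic(iv):
    """Apply the abstract's cutoff: keep the characteristic if IV >= 0.05."""
    return iv >= IV_REJECT_BELOW


# Hypothetical age bins with made-up counts, for illustration only.
toy_age_bins = [("<20", 400, 50), ("20-29", 300, 150), ("30+", 300, 100)]
toy_woe, toy_iv = woe_iv(toy_age_bins)
```

Each term of the IV sum is non-negative, so a characteristic whose bins all have the same positive-to-negative mix scores an IV of zero and would be rejected by the cutoff, whereas bins with clearly different mixes push the IV up.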