• DocumentCode
    592106
  • Title

    Prediction of Infectious Disease Spread Using Twitter: A Case of Influenza

  • Author

    Hirose, Hideo ; Wang, Liangliang

  • Author_Institution
    Sch. of Comput. Sci. & Syst. Eng., Kyushu Inst. of Technol., Fukuoka, Japan
  • fYear
    2012
  • fDate
    17-20 Dec. 2012
  • Firstpage
    100
  • Lastpage
    105
  • Abstract
    Detecting disaster phenomena and predicting their final stage have become very important from the risk-analysis viewpoint. Statistical methods provide accurate parameter estimates when the data are complete; when the data are incomplete, however, the accuracy of the estimates becomes poor, so statistical methods are weak at predicting future trends. SIR methods for infectious disease spread prediction, which use differential equations, can sometimes provide accurate estimates of the final stage. These methods, however, require some inspection time, which delays the analysis by at least a week or so when we want to predict future trends. To detect disasters and predict future trends much earlier, we can use social network systems (SNS). In this paper, we propose a method to predict the future trend of influenza using Twitter. We analyze the possibility of building a regression model by combining Twitter messages with the CDC's Influenza-Like Illness (ILI) data, and we find that the multiple linear regression model with ridge regularization outperforms the single linear regression model and other unregularized least-squares methods. Multiple linear regression with ridge regularization can notably improve prediction accuracy.
  • Keywords
    diseases; inspection; least squares approximations; prediction theory; regression analysis; risk analysis; social networking (online); CDC ILI data; CDC influenza-like illness data; SIR methods; SNS; Twitter messages; differential equations; disaster phenomena detection; infectious disease spread prediction; inspection time; multiple linear regression; multiple linear regression model; prediction accuracy; regression model; ridge regularization; risk analysis view-point; social network system; statistical methods; unregularized least squared methods; Accuracy; Computational modeling; Data models; Linear regression; Market research; Mathematical model; Twitter; AIC; ILI; SNS; Twitter; early detection; infectious disease; influenza; logistic regression; ridge; truncated data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures, Algorithms and Programming (PAAP), 2012 Fifth International Symposium on
  • Conference_Location
    Taipei
  • ISSN
    2168-3034
  • Print_ISBN
    978-1-4673-4566-8
  • Type

    conf

  • DOI
    10.1109/PAAP.2012.23
  • Filename
    6424743
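
The abstract's comparison of ridge-regularized multiple linear regression against unregularized least squares can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `fit_ridge` and `predict`, the penalty value `lam=1.0`, and the synthetic data (random numbers standing in for weekly keyword counts and an ILI-like target) are all assumptions made for illustration only.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge fit on centered data (hypothetical helper).

    Solves (Xc^T Xc + lam * I) coef = Xc^T yc, so the intercept is not
    penalized. With lam = 0 this reduces to ordinary least squares.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n_features = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(X, coef, intercept):
    """Linear prediction from fitted coefficients (hypothetical helper)."""
    return X @ coef + intercept


# Synthetic stand-in data, NOT real Twitter or CDC ILI data:
# strongly correlated feature columns mimic keyword counts that move together.
rng = np.random.default_rng(0)
n_weeks, n_keywords = 30, 8
shared_signal = rng.normal(size=(n_weeks, 1))
X = shared_signal + 0.1 * rng.normal(size=(n_weeks, n_keywords))
y = X @ np.full(n_keywords, 0.5) + 0.3 * rng.normal(size=n_weeks)

coef_ols, b_ols = fit_ridge(X, y, lam=0.0)      # unregularized least squares
coef_ridge, b_ridge = fit_ridge(X, y, lam=1.0)  # ridge; lam chosen arbitrarily

print("OLS coefficient norm:  ", np.linalg.norm(coef_ols))
print("Ridge coefficient norm:", np.linalg.norm(coef_ridge))
```

When feature columns are highly correlated, as keyword counts often are, the unregularized coefficients can become large and unstable; the ridge penalty shrinks them toward zero. In practice the penalty would be chosen by cross-validation or an information criterion rather than fixed in advance.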