DocumentCode
3320982
Title
A cost-effective, case-control study on the association between breast cancer and pregnancy through web mining
Author
Hong-Jun Yoon ; Songhua Xu ; Georgia Tourassi
Author_Institution
Comput. Sci. & Eng. Div., Oak Ridge Nat. Lab., Oak Ridge, TN, USA
fYear
2013
fDate
21-23 May 2013
Firstpage
1
Lastpage
4
Abstract
We report a case-control epidemiological study conducted by mining people's stories from the Internet. Our overarching goal is to test whether mining openly available personal stories from the Internet is a cost-effective way to make reliable epidemiological discoveries. As a case study, we focus on the association between breast cancer risk and pregnancy, which is well established through controlled clinical survey studies. Specifically, we automatically collected and mined 30,000 online obituary articles using a series of tailored cyber-informatics tools we developed. Replicating a case-control study design, we analyzed the collected data and confirmed, with statistical significance, that parity is associated with lower breast cancer risk. Our web mining study provides promising preliminary evidence that online content mining can be a cost-effective and reliable approach to epidemiological knowledge discovery.
Keywords
Internet; cancer; data mining; epidemics; medical computing; Web mining; breast cancer risk; case-control epidemiological study; case-control study design; epidemiological discoveries; epidemiological knowledge discovery; online content mining; online obituary articles mining; people stories mining; pregnancy; tailored cyber-informatics tools; Breast cancer; History; Obituaries; case-control study; epidemiology; obituary
fLanguage
English
Publisher
IEEE
Conference_Titel
Biomedical Sciences and Engineering Conference (BSEC), 2013
Conference_Location
Oak Ridge, TN
Print_ISBN
978-1-4799-2118-8
Type
conf
DOI
10.1109/BSEC.2013.6618493
Filename
6618493