DocumentCode :
2060952
Title :
Knowledge Discovery and Data Mining of Free Text Radiology Reports
Author :
Friedlin, Jeffrey ; Mahoui, Malika ; Jones, Josette ; Jamieson, Patrick
Author_Institution :
Regenstrief Inst., Indiana Univ., Indianapolis, IN, USA
fYear :
2011
fDate :
26-29 July 2011
Firstpage :
89
Lastpage :
96
Abstract :
Medical Knowledge Discovery and Data Mining (KDD) over text is a promising yet difficult technology for unlocking meaning and uncovering associations in vast clinical text repositories. We report our experience in developing a new text analytic system called MEDAT or Medical Exploratory Data Analysis over Text, which overcomes several problems in text mining. The MEDAT system employs an annotated semantic index with a large number of assertions (propositions). The semantic index is able to capture complex assertions which encapsulate conceptual relationships including their modifiers at a granular level. The index represents semantically equivalent sentences with the same symbols, a necessary component for KDD semantic queries, including semantic Boolean and correlation queries. The graphical user interface enables users to perform complex semantic analysis of the Roentgen corpus, consisting of 594,000 de-identified radiology reports with 4.3 million sentences, without having to learn a programming language. The MEDAT architecture offers a novel framework for text mining in other medical domains.
Keywords :
Boolean functions; brain; computational linguistics; data mining; graphical user interfaces; medical administrative data processing; medical computing; patient diagnosis; radiology; semantic networks; text analysis; KDD semantic query; MEDAT; Medical Exploratory Data Analysis over Text; Roentgen corpus; annotated semantic index; clinical text repository; complex assertion; complex semantic analysis; conceptual relationship; correlation query; data association; data mining; free text radiology reports; granular level; graphical user interface; knowledge discovery; modifier; semantic Boolean query; semantically equivalent sentences; text analytic system; text mining; Educational institutions; Heart; Indexes; Medical diagnostic imaging; Ontologies; Radiology; Semantics; Corpus Linguistics; Data Mining; Knowledge Discovery; Natural Language Processing; Semantic Annotation; Semantic Search; Text Analytics; Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Healthcare Informatics, Imaging and Systems Biology (HISB), 2011 First IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4577-0325-6
Electronic_ISBN :
978-0-7695-4407-6
Type :
conf
DOI :
10.1109/HISB.2011.31
Filename :
6061459