DocumentCode
3717270
Title
Data deidentification in medical transcriptions using regular expressions and machine learning
Author
Joshua Seeger;Aron Culotta;Jason Keller;Patrick van Kessel;Michael Jugovich
Author_Institution
NORC at the University of Chicago, 1 North State Street, 14th Floor, Chicago, IL 60602
fYear
2015
Firstpage
1322
Lastpage
1323
Abstract
A system is developed to redact personally identifiable information (PII) from millions of medical transcriptions with very high precision, using a combination of entity recognition, regular expressions, and machine learning. The system is trained and tested on manually redacted medical transcriptions produced with an internally developed coding system that provides double-blind classification capabilities.
Keywords
"Medical services","Medical diagnostic imaging","Encoding","Pipelines","Manuals","Floors","Big data"
Publisher
IEEE
Conference_Titel
Big Data (Big Data), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/BigData.2015.7363889
Filename
7363889