DocumentCode
928817
Title
Polishing blemishes: issues in data correction
Author
Teng, Choh Man
Author_Institution
Inst. for Human & Machine Cognition, Pensacola, FL, USA
Volume
19
Issue
2
fYear
2004
Firstpage
34
Lastpage
39
Abstract
Data quality is crucial to any data analysis task. Many imperfection-handling techniques avoid overfitting or simply remove offending portions of the data. Polishing identifies blemishes in the data and makes corrections to retain and recover as much information as possible. When using information collected from channels susceptible to disturbances, data quality is a concern, especially when the primary objective is to assimilate and understand the data. Imperfections can arise from many sources, including transmission and bandwidth constraints, faults in sensor devices, irregularities in sampling, and transcription errors. An intuitive application that exemplifies handling data imperfections is the spell-checker. Developing such a spell-checker would require novel techniques for repairing data imperfections. We are exploring such techniques using a data correction method called polishing. Here, we compare polishing to two alternative approaches to handling data imperfections, focusing on how to evaluate and validate data correction mechanisms.
Keywords
data analysis; data handling; data integrity; data mining; bandwidth constraint; data analysis task; data correction method; data imperfection-handling technique; data quality; spell-checker; transcription error; Bandwidth; Cognition; Data analysis; Data mining; Filtering; Humans; Intelligent sensors; Machine learning algorithms; Noise robustness; Sampling methods
fLanguage
English
Journal_Title
Intelligent Systems, IEEE
Publisher
IEEE
ISSN
1541-1672
Type
jour
DOI
10.1109/MIS.2004.1274909
Filename
1274909