DocumentCode :
2079797
Title :
Data cleansing as a transient service
Author :
Faruquie, Tanveer A. ; Prasad, K. Hima ; Subramaniam, L. Venkata ; Mohania, Mukesh ; Venkatachaliah, Girish ; Kulkarni, Shrinivas ; Basu, Pramit
Author_Institution :
IBM India Res. Lab., New Delhi, India
fYear :
2010
fDate :
1-6 March 2010
Firstpage :
1025
Lastpage :
1036
Abstract :
There is often a transient need within enterprises for data cleansing, which can be satisfied by offering data cleansing as a transient service. Every time a data cleansing need arises, it should be possible to provision hardware, software and staff for accomplishing the task and then dismantle the setup. In this paper we present such a system that uses virtualized hardware and software for data cleansing. We share actual experiences gained from building such a system. We use a cloud infrastructure to offer virtualized data cleansing instances that can be accessed as a service. We build a system that is scalable, elastic and configurable. Each enterprise has unique needs, which makes it necessary to customize both the infrastructure and the cleansing algorithms to address these needs. In this paper we will present a system that is easily configurable to suit the data cleansing needs of an enterprise.
Keywords :
Internet; data mining; cleansing algorithms; cloud infrastructure; data cleansing; transient service; virtualized data cleansing; virtualized hardware; virtualized software; Clouds; Costs; Customer service; Databases; Decision making; Delay; Error analysis; Hardware; Investments; Software maintenance
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2010 IEEE 26th International Conference on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4244-5445-7
Electronic_ISBN :
978-1-4244-5444-0
Type :
conf
DOI :
10.1109/ICDE.2010.5447789
Filename :
5447789