Title :
Text and structured data fusion in Data Tamer at scale
Author :
Gubanov, Michael; Stonebraker, M.; Bruckner, Dietmar
Author_Institution :
MIT CSAIL, Cambridge, MA, USA
Date :
March 31 2014-April 4 2014
Abstract :
Large-scale text data research has recently started to regain momentum [1]-[10] because of the wealth of up-to-date information communicated in unstructured format. For example, new information in online media (e.g., Web blogs, Twitter, Facebook, news feeds) becomes instantly available, is refreshed regularly, and has very broad coverage along with other valuable properties that are unusual for other data sources and formats. Therefore, many enterprises and individuals are interested in integrating and using unstructured text in addition to their structured data.
Keywords :
data integration; data structures; sensor fusion; text analysis; DATA TAMER; data cleaning; data formats; data integration system; data transformations; entity consolidation module; expert-sourcing mechanism; human guidance; large-scale text data research; online media; schema integration facility; structured data fusion; structured data sources; text fusion; Blogs; Cleaning; Data integration; Distributed databases; Media; Motion pictures; Schedules
Conference_Title :
Data Engineering (ICDE), 2014 IEEE 30th International Conference on
Conference_Location :
Chicago, IL
DOI :
10.1109/ICDE.2014.6816755