DocumentCode
3717173
Title
Brown Dog: Leveraging everything towards autocuration
Author
Smruti Padhy;Greg Jansen;Jay Alameda;Edgar Black;Liana Diesendruck;Mike Dietze;Praveen Kumar;Rob Kooper;Jong Lee;Rui Liu;Richard Marciano;Luigi Marini;Dave Mattson;Barbara Minsker;Chris Navarro;Marcus Slavenas;William Sullivan;Jason Votava;Inna Zharnitsky
Author_Institution
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
fYear
2015
Firstpage
493
Lastpage
500
Abstract
We present Brown Dog, two highly extensible services that aim to leverage any existing pieces of code, libraries, services, or standalone software (past or present) to provide users with a simple-to-use, programmable means of automated aid in the curation and indexing of distributed collections of uncurated and/or unstructured data. Such collections, which encompass a large variety of data in addition to large amounts of data, pose a significant challenge within modern-day "Big Data" efforts. The two services, the Data Access Proxy (DAP) and the Data Tilling Service (DTS), focus on format conversions and content-based analysis/extraction respectively. They wrap relevant conversion and extraction operations within arbitrary software, manage their deployment in an elastic manner, and manage job execution from behind a deliberately compact REST API. We describe the motivation and scientific drivers for such services, the constituent components that allow arbitrary software/code to be used and managed, and lastly an evaluation of the system's capabilities and scalability.
Keywords
"Data mining","Metadata","Software","Libraries","Big data","Indexing"
Publisher
IEEE
Conference_Titel
Big Data (Big Data), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/BigData.2015.7363791
Filename
7363791