Title :
A Framework for Efficient Data Analytics through Automatic Configuration and Customization of Scientific Workflows
Author :
Hauder, Matheus ; Gil, Yolanda ; Liu, Yan
Author_Institution :
Inst. for Software & Syst. Eng., Univ. of Augsburg, Augsburg, Germany
Abstract :
Data analytics involves choosing among many different algorithms and experimenting with possible combinations of those algorithms. Existing approaches, however, do not support scientists in the laborious task of exploring the design space of computational experiments. We have developed a framework to assist scientists with data analysis tasks, in particular machine learning and data mining. It takes advantage of the unique capabilities of the Wings workflow system to reason about semantic constraints. We show how the framework can rule out invalid workflows and help scientists explore the design space. We demonstrate our system in the domain of text analytics and outline the benefits of our approach.
Keywords :
data analysis; data mining; learning (artificial intelligence); natural sciences computing; text analysis; workflow management software; Wings workflow system; computational experiment; data analysis; data analytics; data mining; machine learning; scientific workflow configuration; scientific workflow customization; text analytics; Algorithm design and analysis; Clustering algorithms; Correlation; Machine learning algorithms; Prediction algorithms; Software; Software algorithms; Data Analytics; Scientific Workflows;
Conference_Title :
E-Science (e-Science), 2011 IEEE 7th International Conference on
Conference_Location :
Stockholm
Print_ISBN :
978-1-4577-2163-2
DOI :
10.1109/eScience.2011.59