  • DocumentCode
    610402
  • Title
    HFMS: Managing the lifecycle and complexity of hybrid analytic data flows
  • Author
    Simitsis, Alkis; Wilkinson, K.; Dayal, U.; Hsu, Meichun
  • Author_Institution
    HP Labs., Palo Alto, CA, USA
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    1174
  • Lastpage
    1185
  • Abstract
    To remain competitive, enterprises are evolving their business intelligence systems to provide dynamic, near realtime views of business activities. To enable this, they deploy complex workflows of analytic data flows that access multiple storage repositories and execution engines and that span the enterprise and even outside the enterprise. We call these multi-engine flows hybrid flows. Designing and optimizing hybrid flows is a challenging task. Managing a workload of hybrid flows is even more challenging since their execution engines are likely under different administrative domains and there is no single point of control. To address these needs, we present a Hybrid Flow Management System (HFMS). It is an independent software layer over a number of independent execution engines and storage repositories. It simplifies the design of analytic data flows and includes optimization and executor modules to produce optimized executable flows that can run across multiple execution engines. HFMS dispatches flows for execution and monitors their progress. To meet service level objectives for a workload, it may dynamically change a flow's execution plan to avoid processing bottlenecks in the computing infrastructure. We present the architecture of HFMS and describe its components. To demonstrate its potential benefit, we describe performance results for running sample batch workloads with and without HFMS. The ability to monitor multiple execution engines and to dynamically adjust plans enables HFMS to provide better service guarantees and better system utilization.
  • Keywords
    competitive intelligence; data analysis; storage management; HFMS; business intelligence system; complexity management; executor module; hybrid analytic data flow; hybrid flow management system; independent execution engine; independent software layer; lifecycle management; multiengine flows hybrid flow; optimization module; storage repository; system utilization; workload management; Business; Connectors; Databases; Engines; Fault tolerance; Monitoring; Optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-4909-3
  • Electronic_ISBN
    1063-6382
  • Type
    conf
  • DOI
    10.1109/ICDE.2013.6544907
  • Filename
    6544907