• DocumentCode
    3446500
  • Title
    An intelligent ETL workflow framework based on data partition
  • Author
    Tu, Yingying ; Guo, Chaozhen
  • Author_Institution
    Coll. of Math. & Comput. Sci., Fuzhou Univ., Fuzhou, China
  • Volume
    3
  • fYear
    2010
  • fDate
    29-31 Oct. 2010
  • Firstpage
    358
  • Lastpage
    363
  • Abstract
    ETL tools are an important part of building data warehouses and data centers. For massive data processing, this paper presents an intelligent ETL workflow framework built on distributed computing servers. The framework adds an intelligent control module that acquires data on system efficiency and resources as well as operating data, dynamically adjusts the ETL strategy, and partitions the data of larger jobs accordingly. This optimizes the workflow for multi-machine parallel execution, improves operational efficiency, and facilitates error recovery. The intelligent control module is composed of a monitor, a knowledge base, and a selector. Horizontal partitioning of the source data is both the basis of, and the main difficulty in, achieving multi-machine parallelism.
  • Keywords
    computer centres; data acquisition; data warehouses; parallel processing; data centers; data processing; data segmentation; data warehouse; distributed computing servers; intelligent ETL workflow optimization framework; intelligent control module; knowledge base; multimachine parallel execution; source data horizontal partition; Decision making; Monitoring; Scheduling; Servers; Distributed Computation; Intelligence ETL; data partition; multi-agent;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Computing and Intelligent Systems (ICIS), 2010 IEEE International Conference on
  • Conference_Location
    Xiamen
  • Print_ISBN
    978-1-4244-6582-8
  • Type
    conf
  • DOI
    10.1109/ICICISYS.2010.5658640
  • Filename
    5658640
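
The abstract names horizontal partitioning of the source data as the basis for multi-machine parallel execution. The record gives no algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes rows are held in memory as dictionaries, splits them into near-equal contiguous chunks, and uses a thread pool as a stand-in for separate ETL servers. The names `horizontal_partition`, `run_partitioned`, and `uppercase_name` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, object]


def horizontal_partition(rows: List[Row], num_parts: int) -> List[List[Row]]:
    """Split rows into num_parts contiguous chunks whose sizes differ by at most one."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    base, extra = divmod(len(rows), num_parts)
    parts: List[List[Row]] = []
    start = 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional row each.
        size = base + (1 if i < extra else 0)
        parts.append(rows[start:start + size])
        start += size
    return parts


def run_partitioned(
    rows: List[Row],
    num_parts: int,
    transform: Callable[[List[Row]], List[Row]],
) -> List[Row]:
    """Apply transform to each partition concurrently and merge results in order.

    Assumption: worker threads stand in for the distributed servers of the paper.
    """
    parts = horizontal_partition(rows, num_parts)
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        # pool.map returns results in the same order as the input partitions.
        results = list(pool.map(transform, parts))
    return [row for part in results for row in part]


def uppercase_name(part: List[Row]) -> List[Row]:
    """Hypothetical transform step: uppercase the 'name' field of every row."""
    return [{**row, "name": str(row["name"]).upper()} for row in part]


if __name__ == "__main__":
    source = [{"id": i, "name": f"n{i}"} for i in range(7)]
    print([len(p) for p in horizontal_partition(source, 3)])
    print(run_partitioned(source, 3, uppercase_name))
```

Because each partition is processed independently, a failed partition could in principle be rerun on its own, which is one way to read the abstract's remark about facilitating error recovery.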