• DocumentCode
    2385379
  • Title

    A Framework for Statistical Analysis of Datasets on Heterogeneous Clusters

  • Author

    Carino, R.L. ; Banicescu, Ioana

  • Author_Institution
    Center for Comput. Sci., Mississippi State Univ.
  • fYear
    2005
  • fDate
    Sept. 2005
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    This paper proposes a framework for the statistical analysis of multiple related datasets on heterogeneous clusters. The set of processors assigned to the framework is partitioned into groups according to rack locations, with the group sizes chosen to match the degree of concurrency in the analysis procedure. The datasets are initially divided among the groups. Dynamic loop scheduling is employed to address load imbalance arising from differences in the computational power of the groups, the variability of dataset sizes, and unpredictable irregularities in the cluster environment. Results of preliminary tests indicate the effectiveness of the framework in fitting gamma-ray burst datasets with vector functional coefficient autoregressive time series models on 64 processors of a heterogeneous general-purpose Linux cluster.
  • Keywords
    Linux; autoregressive processes; concurrency control; processor scheduling; resource allocation; statistical analysis; time series; workstation clusters; Linux cluster; autoregressive time series models; concurrency; dynamic loop scheduling; gamma-ray burst datasets; heterogeneous clusters; load imbalance; Computer networks; Computer science; Concurrent computing; Costs; Data analysis; Gamma ray bursts; Level control; Load management
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2005. IEEE International
  • Conference_Location
    Burlington, MA
  • ISSN
    1552-5244
  • Print_ISBN
    0-7803-9486-0
  • Electronic_ISBN
    1552-5244
  • Type

    conf

  • DOI
    10.1109/CLUSTR.2005.347019
  • Filename
    4154147