• DocumentCode
    643186
  • Title
    Efficient parallelization of batch pattern training algorithm on many-core and cluster architectures
  • Author
    Turchenko, Volodymyr; Bosilca, George; Bouteiller, Aurelien; Dongarra, Jack
  • Author_Institution
    Innovative Comput. Lab., Univ. of Tennessee, Knoxville, TN, USA
  • Volume
    02
  • fYear
    2013
  • fDate
    12-14 Sept. 2013
  • Firstpage
    692
  • Lastpage
    698
  • Abstract
    This paper presents experimental research on the parallel batch pattern back-propagation training algorithm, applied to a recirculation neural network, on many-core high-performance computing systems. The choice of a recirculation neural network over the multilayer perceptron, recurrent, and radial basis neural networks is justified. The model of a recirculation neural network and the usual sequential batch pattern algorithm for its training are described theoretically. An algorithmic description of the parallel version of the batch pattern training method is presented. The experiments were carried out using the Open MPI, MVAPICH, and Intel MPI message passing libraries. The results obtained on a many-core AMD system and on the Intel MIC are compared with those obtained on a cluster system. Our results show that parallelization efficiency is about 95% on 12 cores located inside one physical AMD processor for the considered minimum and maximum scenarios, and about 70-75% on 48 AMD cores for the same scenarios. These results are 15-36% higher (depending on the version of the MPI library) than those obtained on 48 cores of a cluster system. The parallelization efficiency obtained on the Intel MIC architecture is surprisingly low and calls for deeper analysis.
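    The batch pattern scheme the abstract refers to distributes training patterns across workers, lets each worker accumulate a partial weight gradient over its own subset, and then sums the partial gradients (in the paper this reduction is done with MPI). A minimal sketch of that idea, assuming a tiny linear autoencoder as a stand-in for the recirculation network — the network, data, and all names here are illustrative, not the authors' implementation:

    ```python
    # Hypothetical sketch: batch pattern parallelization splits patterns across
    # workers; summing the per-worker partial gradients reproduces the
    # sequential full-batch gradient (the MPI all-reduce step is simulated by
    # a plain Python sum over worker slices).
    import random

    random.seed(0)

    N_IN, N_HID = 4, 2          # illustrative layer sizes (autoencoder-style)
    N_PATTERNS, N_WORKERS = 8, 4

    patterns = [[random.uniform(-1, 1) for _ in range(N_IN)]
                for _ in range(N_PATTERNS)]
    # tied encoder/decoder weights W[j][i]
    W = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HID)]

    def pattern_gradient(x):
        """Gradient of 0.5*||x_hat - x||^2 for one pattern, tied weights W."""
        h = [sum(W[j][i] * x[i] for i in range(N_IN)) for j in range(N_HID)]
        x_hat = [sum(W[j][i] * h[j] for j in range(N_HID)) for i in range(N_IN)]
        e = [x_hat[i] - x[i] for i in range(N_IN)]
        # decoder path contributes e[i]*h[j]; encoder path (tied W) contributes
        # (sum_k e[k]*W[j][k]) * x[i]
        return [[e[i] * h[j] + sum(e[k] * W[j][k] for k in range(N_IN)) * x[i]
                 for i in range(N_IN)] for j in range(N_HID)]

    def add(a, b):
        return [[a[j][i] + b[j][i] for i in range(N_IN)] for j in range(N_HID)]

    zero = [[0.0] * N_IN for _ in range(N_HID)]

    # sequential batch gradient: sum over all patterns on one core
    g_seq = zero
    for x in patterns:
        g_seq = add(g_seq, pattern_gradient(x))

    # "parallel" version: each worker sums over its own slice of patterns,
    # then the partial gradients are reduced (summed) across workers
    g_par = zero
    for w in range(N_WORKERS):
        g = zero
        for x in patterns[w::N_WORKERS]:
            g = add(g, pattern_gradient(x))
        g_par = add(g_par, g)
    ```

    The reduced gradient `g_par` matches the sequential `g_seq` up to floating-point summation order, which is why the batch pattern approach parallelizes the training step without changing its result.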
  • Keywords
    application program interfaces; batch processing (computers); learning (artificial intelligence); message passing; multilayer perceptrons; multiprocessing systems; parallel architectures; workstation clusters; AMD cores; Intel MIC architecture; Intel MPI message passing libraries; MPI library; Mvapich; Open MPI; batch pattern training algorithm; batch pattern training method; cluster architectures; cluster system; many-core AMD system; many-core architectures; many-core high performance computing systems; multilayer perceptron; parallel batch pattern back propagation training algorithm; parallel version; parallelization; physical AMD processor; radial basis neural networks; recirculation neural network; sequential batch pattern algorithm; Algorithm design and analysis; Artificial neural networks; Clustering algorithms; Lips; Microwave integrated circuits; Neurons; Training; many-core system; parallel batch pattern training; parallelization efficiency; recirculation neural network;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    2013 IEEE 7th International Conference on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS)
  • Conference_Location
    Berlin
  • Print_ISBN
    978-1-4799-1426-5
  • Type
    conf
  • DOI
    10.1109/IDAACS.2013.6663014
  • Filename
    6663014