• DocumentCode
    2604195
  • Title

    Optimizing protocol parameters to large scale PC cluster and evaluation of its effectiveness with parallel data mining

  • Author

    Oguchi, Masato ; Shintani, Takahiko ; Tamura, Takayuki ; Kitsuregawa, Masaru

  • Author_Institution
    Inst. of Ind. Sci., Tokyo Univ., Japan
  • fYear
    1998
  • fDate
    28-31 Jul 1998
  • Firstpage
    34
  • Lastpage
    41
  • Abstract
    PC clusters have been studied intensively for next-generation large scale parallel computers. ATM technology is a strong candidate as a de facto standard of high speed communication networks. Therefore an ATM connected PC cluster is a very promising platform from the cost/performance point of view, as a future high performance computing environment. An ATM connected PC cluster consisting of 100 PCs is reported, and characteristics of a transport layer protocol for the PC cluster are evaluated. Point-to-point communication performance is measured and discussed when a TCP window size parameter is changed. Retransmission caused by cell loss at the ATM switch is analyzed, and parameters of the retransmission mechanism suitable for parallel processing on the large scale PC cluster are clarified. From the viewpoint of applications, data intensive applications such as data mining and ad-hoc query processing in databases are considered to be very important for massively parallel processors, in addition to conventional scientific calculations. Thus, investigating the feasibility of such applications on an ATM connected PC cluster is quite meaningful. Parallel data mining is implemented and evaluated on the cluster. The default TCP protocol cannot provide good performance, since a lot of collisions happen during all-to-all multicasting executed on the large scale PC cluster. Using TCP parameters according to the proposed optimization, sufficient performance improvement is achieved for parallel data mining on 100 PCs.
  • Keywords
    asynchronous transfer mode; knowledge acquisition; local area networks; parallel machines; performance evaluation; query processing; transport protocols; ATM connected PC cluster; ATM switch; TCP window size parameter; ad-hoc query processing; all-to-all multicasting; cell loss; data intensive applications; databases; high speed communication networks; large scale PC cluster; large scale parallel computers; parallel data mining; point-to-point communication performance; protocol parameter optimisation; retransmission; scientific calculation; transport layer protocol; Asynchronous transfer mode; Communication networks; Communication standards; Concurrent computing; Costs; Data mining; Large-scale systems; Personal communication networks; Protocols; Switches;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Distributed Computing, 1998. Proceedings. The Seventh International Symposium on
  • Conference_Location
    Chicago, IL
  • ISSN
    1082-8907
  • Print_ISBN
    0-8186-8579-4
  • Type

    conf

  • DOI
    10.1109/HPDC.1998.709950
  • Filename
    709950