DocumentCode
720542
Title
Reconfigurations for Processor Arrays with Faulty Switches and Links
Author
Jigang Wu ; Longting Zhu ; Peilan He ; Guiyuan Jiang
Author_Institution
Sch. of Comput. Sci. & Technol., Guangdong Univ. of Technol., Guangzhou, China
fYear
2015
fDate
4-7 May 2015
Firstpage
141
Lastpage
148
Abstract
Large-scale multiprocessor arrays suffer from frequent hardware defects or soft faults caused by overheating, overload, or occupancy by other running applications. To obtain a fault-free logical array, reconfiguration techniques reuse the fault-free processing elements (PEs) by changing the interconnections among them. Previous research on this topic assumes that switches and links are fault-free. In this paper, we consider faults not only on the PEs but also on the switches and links, and develop efficient algorithms to construct logical arrays that are as large as possible with optimized interconnection length. To deal with faults on switches and links, we design an efficient pre-processing procedure in which switch faults are transformed into link faults, and the faulty links are then classified into several categories that are handled separately. We then propose an efficient algorithm, A-MLA, that produces as many logical columns as possible, which are combined to form a two-dimensional processor array. After that, we propose an algorithm, A-TMLA, that reduces the interconnection length of the logical array obtained by A-MLA, since shorter interconnects lead to lower communication latency and power consumption. Extensive experimental results show that, even with switch faults and link faults, our approach produces larger fault-free logical arrays with shorter interconnection length than the state of the art.
Keywords
multiprocessing systems; power aware computing; power consumption; switches; 2D processor array; A-MLA; A-TMLA; fault-free PE; fault-free logical array; faulty switches; large scale multiprocessor array; larger logical fault-free arrays; links; processing elements; processor array reconfigurations; Fault tolerant systems; Joining processes; Logic arrays; Parallel processing; Power demand; Redundancy; Fault-tolerance; interconnection length; link faults; processor array; switch faults
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location
Shenzhen
Type
conf
DOI
10.1109/CCGrid.2015.47
Filename
7152480