Title :
On Making Transactional Applications Resilient to Data Corruption Faults
Author :
Mohamedin, Mohamed ; Palmieri, Roberto ; Ravindran, Binoy
Author_Institution :
Virginia Tech, Blacksburg, VA, USA
Abstract :
Multicore architectures are becoming increasingly prone to transient faults and data corruption, yet relying on them remains the common way to increase the performance and scalability of core applications, including transactional applications. In this paper we present SoftX, a low-invasive protocol that supports the execution of transactional applications by relying on speculative processing and dedicated committer threads. Upon starting a transaction, SoftX forks a number of threads that run the same transaction independently. The commit phase is handled by dedicated threads to optimize synchronization overhead. We conduct an evaluation study showing the performance of a SoftX implementation on a 48-core AMD machine, running the List, Bank, and TPC-C benchmarks. Results reveal better performance than classical replication-based fault-tolerant systems and limited overhead with respect to non-fault-tolerant protocols. We also ported SoftX to a message-passing architecture, the Tilera TILE-Gx, since hardware message passing is an important emerging trend in multicore architectures. Our experiments on Tilera show that SoftX is still more efficient than replication.
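The fork-and-commit idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SoftX implementation: the abstract does not say how replica results are compared, so the majority vote used here is an assumption, and every name (`run_replicas`, `vote`, `Committer`, `transfer`) is hypothetical. Each replica thread runs the same transaction on a private copy of the data and returns a buffered write-set; a single dedicated committer thread then decides whether to apply it.

```python
import queue
import threading
from collections import Counter


def run_replicas(txn, snapshot, n_replicas=3):
    """Run the same transaction in n_replicas threads, each on a private copy."""
    results = [None] * n_replicas

    def worker(i):
        # Each replica buffers its writes instead of touching shared state.
        results[i] = txn(dict(snapshot))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_replicas)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def vote(write_sets):
    """Return the write-set a strict majority agrees on, else None (assumed policy)."""
    keyed = Counter(tuple(sorted(ws.items())) for ws in write_sets)
    winner, count = keyed.most_common(1)[0]
    return dict(winner) if count > len(write_sets) // 2 else None


class Committer(threading.Thread):
    """Dedicated thread that serializes all commits, so replicas never lock the store."""

    def __init__(self, store):
        super().__init__(daemon=True)
        self.store = store
        self.requests = queue.Queue()

    def run(self):
        while True:
            item = self.requests.get()
            if item is None:  # shutdown signal
                break
            write_sets, done = item
            agreed = vote(write_sets)
            if agreed is not None:
                self.store.update(agreed)  # apply the agreed write-set
            done.put(agreed is not None)

    def commit(self, write_sets):
        """Hand write-sets to the committer and wait for its decision."""
        done = queue.Queue(maxsize=1)
        self.requests.put((write_sets, done))
        return done.get()


def transfer(view):
    """Example transaction: move 10 units from account a to b; returns its write-set."""
    return {"a": view["a"] - 10, "b": view["b"] + 10}
```

A caller would start one `Committer` over the shared store, pass the result of `run_replicas(transfer, store)` to `commit`, and treat a `False` return as a transaction to retry. In this sketch a single corrupted replica out of three is outvoted, whereas three mutually disagreeing replicas cause the commit to be rejected.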
Keywords :
benchmark testing; fault tolerant computing; message passing; multiprocessing systems; parallel architectures; performance evaluation; synchronisation; AMD machine; Bank benchmarks; Hardware message-passing; List benchmarks; SoftX forks; TPC-C benchmarks; Tilera TILE-Gx; core applications; data corruption faults; low-invasive protocol; message-passing architecture; multicore architecture; nonfault-tolerant protocols; replication-based fault-tolerant systems; speculative processing; synchronization overhead; transactional applications; transient faults; Computer architecture; Fault tolerance; Fault tolerant systems; Hardware; Instruction sets; Message systems; Transient analysis; Fault tolerance; Multicore processing; Transactional systems;
Conference_Title :
Network Computing and Applications (NCA), 2014 IEEE 13th International Symposium on
Conference_Location :
Cambridge, MA
Print_ISBN :
978-1-4799-5392-9
DOI :
10.1109/NCA.2014.39