Title :
A study of throughput degradation following single node failure in a data sharing system
Author :
Bowen, N.S. ; Roy-Chowdhury, A.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
The data sharing approach to building distributed database systems is becoming more common because of its potentially higher processing power and flexibility compared to data partitioning. However, due to the large amounts of hardware and complex software involved, the likelihood of a single node failure in the system increases. Following a single node failure, some processing has to be done to determine the set of locks held by transactions which were executing at the failed node. These locks cannot be released until database recovery has completed on the failed node. This phenomenon can cause throughput degradation even if the processing power on the surviving nodes is adequate to handle all incoming transactions. This paper studies the throughput dropoff behavior following a single node failure in a data sharing system through simulations and analytical modeling. The analytical model reveals several important factors affecting post-failure behavior and is shown to match simulations quite accurately. The effect of hot locks (locks which are frequently accessed) on post-failure behavior is observed. Simulations are performed to observe system behavior after the set of locks held by transactions on the failed node has been determined and show that if the delay in obtaining this information is too large, the system is prone to thrashing.
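Illustrative note (not part of the original abstract): the degradation mechanism described above can be sketched with a toy calculation. The sketch below is not the paper's analytical model. It assumes each transaction makes a fixed number of independent lock requests, and that a transaction stalls if any request hits a lock still retained on behalf of the failed node. The function names, the parameters, and the hot/cold split are all hypothetical choices made for illustration.

```python
def unblocked_fraction(total_locks: int, retained_locks: int,
                       locks_per_txn: int) -> float:
    """Fraction of transactions that avoid every retained lock.

    Assumption: each of the locks_per_txn requests picks a lock
    uniformly and independently from total_locks.
    """
    if total_locks <= 0 or not 0 <= retained_locks <= total_locks:
        raise ValueError("need total_locks > 0 and 0 <= retained_locks <= total_locks")
    return (1.0 - retained_locks / total_locks) ** locks_per_txn


def unblocked_fraction_hot(hot_locks: int, cold_locks: int,
                           retained_hot: int, retained_cold: int,
                           hot_request_prob: float,
                           locks_per_txn: int) -> float:
    """Same idea with a skewed access pattern.

    Assumption: each request goes to the hot set with probability
    hot_request_prob (uniform within that set), otherwise to the cold set.
    """
    if hot_locks <= 0 or cold_locks <= 0:
        raise ValueError("both lock sets must be non-empty")
    # Probability that a single request lands on a retained lock.
    per_request_blocked = (
        hot_request_prob * retained_hot / hot_locks
        + (1.0 - hot_request_prob) * retained_cold / cold_locks
    )
    return (1.0 - per_request_blocked) ** locks_per_txn


if __name__ == "__main__":
    # Uniform access: 1% of locks retained, 10 lock requests per transaction.
    print(unblocked_fraction(10_000, 100, 10))
    # Skewed access: the same 100 retained locks, 20 of them in a 100-lock hot set.
    print(unblocked_fraction_hot(100, 9_900, 20, 80, 0.5, 10))
```

Under these assumptions, even a small retained set can stall a noticeable share of transactions once several locks are requested per transaction, and concentrating retained locks in a frequently accessed set lowers the unblocked fraction further. This is consistent in spirit with the abstract's remarks on hot locks, but the numbers carry no quantitative weight.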
Keywords :
distributed databases; fault tolerant computing; system recovery; transaction processing; analytical modeling; data partitioning; data sharing system; distributed database systems; hot locks; post-failure behavior; single node failure; thrashing; throughput degradation; Analytical models; Availability; Database systems; Degradation; Failure analysis; Hardware; Internet; Power system modeling; Throughput; Transaction databases;
Conference_Title :
Fault-Tolerant Computing, 1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on
Conference_Location :
Austin, TX, USA
Print_ISBN :
0-8186-5520-8
DOI :
10.1109/FTCS.1994.315629