Title : 
Service Middleware for Self-Managing Large-Scale Systems
         
        
            Author : 
Adam, Constantin ; Stadler, Rolf
         
        
            Author_Institution : 
Laboratory for Communication Networks, KTH Royal Institute of Technology, Stockholm
         
        
            Abstract : 
Resource management poses particular challenges in large-scale systems, such as server clusters that simultaneously process requests from a large number of clients. A resource management scheme for such systems must scale both in the number of cluster nodes and in the number of applications the cluster supports. Current solutions do not exhibit both of these properties at the same time. Many are centralized, which limits their scalability in terms of the number of nodes, or they are decentralized but rely on replicated directories, which also reduces their ability to scale. In this paper, we propose novel solutions to request routing and application placement, two key mechanisms in a scalable resource management scheme. Our solution to request routing is based on selective update propagation, which ensures that the control load on a cluster node is independent of the system size. Application placement is approached in a decentralized manner, using a distributed algorithm that maximizes resource utilization and allows for service differentiation under overload. The paper demonstrates how these solutions can be integrated into an overall design for a peer-to-peer management middleware that exhibits properties of self-organization. Through complexity analysis and simulation, we show to what extent the system design is scalable. We have built a prototype using accepted technologies and have evaluated it using a standard benchmark. The testbed measurements show that the implementation, within the parameter range tested, operates efficiently, adapts quickly to a changing environment, and allows for effective service differentiation by a system administrator.
         
        
            Keywords : 
client-server systems; computational complexity; distributed algorithms; middleware; peer-to-peer computing; resource allocation; application placement; client-server cluster; complexity analysis; distributed algorithm; peer-to-peer management middleware; request routing; resource utilization maximization; scalable resource management scheme; selective update propagation; self-managing large-scale system; service differentiation; Control systems; Distributed algorithms; Large-scale systems; Middleware; Peer to peer computing; Resource management; Routing; Scalability; Size control; System testing;
         
        
        
            Journal_Title : 
Network and Service Management, IEEE Transactions on
         
            DOI : 
10.1109/TNSM.2007.021103