DocumentCode
2307107
Title
A study of Java virtual machine scalability issues on SMP systems
Author
Cao, Zhongbo ; Huang, Wei ; Chang, J. Morris
Author_Institution
Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA, USA
fYear
2005
fDate
6-8 Oct. 2005
Firstpage
119
Lastpage
128
Abstract
This paper studies the scalability issues of the Java virtual machine (JVM) on symmetrical multiprocessing (SMP) systems. Using a cycle-accurate simulator, we evaluate how the performance of multithreaded Java benchmarks scales with the number of processors and application threads. By correlating low-level hardware performance data with two high-level software constructs, thread types and memory regions, we present a detailed performance analysis and study the potential impact of memory system latencies and resource contention on scalability. The paper reports several key findings. First, among the components of memory access latency, most memory stalls are produced by L2 cache misses and cache-to-cache transfers. Second, among the memory regions, the Java heap space produces the most memory stalls. Third, a large majority of memory stalls occur in application threads rather than in other JVM threads. Furthermore, we find that increasing either the number of processors or the number of application threads, independently of the other, increases the L2 cache miss ratio and the cache-to-cache transfer ratio. This problem can be alleviated by using a thread-local heap or allocation buffer, which improves L2 cache performance. For certain benchmarks such as Raytracer, cache-to-cache transfers, which are dominated by false sharing, can be significantly reduced. Our experiments also show that a thread-local allocation buffer with a size between 16 KB and 256 KB often leads to optimal performance.
Keywords
Java; benchmark testing; cache storage; multi-threading; multiprocessing systems; resource allocation; software performance evaluation; virtual machines; Java heap space; Java virtual machine scalability; L2 cache misses; allocation buffer; cache-to-cache transfers; cycle-accurate simulator; memory system latencies; multithreaded Java benchmarks; performance scaling evaluation; resource contentions; symmetrical multiprocessing systems; thread-local heap; Computational modeling; Delay; Java; Kernel; Operating systems; Performance analysis; Processor scheduling; Scalability; Virtual machining; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Workload Characterization Symposium, 2005. Proceedings of the IEEE International
Print_ISBN
0-7803-9461-5
Type
conf
DOI
10.1109/IISWC.2005.1526008
Filename
1526008