DocumentCode
2093059
Title
DM-PAS: A Data Mining Prefetching Algorithm for Storage System
Author
Nijim, Mais ; Nijim, Yousef ; Sker, R. ; Reddy, Vamshi ; Raju, R.N.
Author_Institution
Electr. Eng. & Comput. Sci., Texas A&M Kingsville, Kingsville, TX, USA
fYear
2011
fDate
2-4 Sept. 2011
Firstpage
500
Lastpage
505
Abstract
This paper is motivated by a global online satellite image distribution system operated at the Earth Resources Observation and Science (EROS) center of the U.S. Geological Survey. Fundamental objectives of EROS include, but are not limited to, building high-speed and cost-effective massive data processing and storage systems to support online satellite image distribution. Hybrid storage systems, which contain solid-state drives (SSDs), hard disks (HDDs), and tapes, can provide an ideal data storage solution for a wide variety of data processing centers like EROS. Large-scale hybrid storage systems will become increasingly popular in the next few years for the following two reasons. First, highly accessed storage objects in a hybrid storage system can be prefetched and cached to high-speed storage components such as solid-state drives. SSD-based hybrid storage systems can provide large storage capacity, high I/O performance, and data reliability. Second, hybrid storage systems are cost-effective, because inexpensive tapes help increase storage capacity at very low cost. Transferring data back and forth among SSDs, HDDs, and tapes plays a critical role in achieving high I/O performance. Thus, we propose data mining algorithms that can judiciously prefetch data. Our analytical model and experimental results reveal that our data mining prefetching algorithm increases the performance of hybrid storage systems.
Keywords
Internet; data mining; geophysical image processing; storage management; Earth resources observation and science center; SSD-based hybrid storage system; U.S geological survey; data mining prefetching algorithm; data reliability; data transfer; global online satellite image distribution system; high speed cost effective massive data processing; high speed storage component; highly accessed storage object; solid state drive; storage capacity; Data mining; Hard disks; Nonhomogeneous media; Prefetching; Satellites; Servers; Storage area networks; Data Mining; Prefetching; Solid State Disks;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing and Communications (HPCC), 2011 IEEE 13th International Conference on
Conference_Location
Banff, AB
Print_ISBN
978-1-4577-1564-8
Electronic_ISBN
978-0-7695-4538-7
Type
conf
DOI
10.1109/HPCC.2011.71
Filename
6063031