• DocumentCode
    2093059
  • Title

    DM-PAS: A Data Mining Prefetching Algorithm for Storage System

  • Author

    Nijim, Mais ; Nijim, Yousef ; Sker, R. ; Reddy, Vamshi ; Raju, R.N.

  • Author_Institution
    Electr. Eng. & Comput. Sci., Texas A&M Kingsville, Kingsville, TX, USA
  • fYear
    2011
  • fDate
    2-4 Sept. 2011
  • Firstpage
    500
  • Lastpage
    505
  • Abstract
    This paper is motivated by a global online satellite image distribution system operated at the Earth Resources Observation and Science (EROS) center of the U.S. Geological Survey. Fundamental objectives of EROS include, but are not limited to, building high-speed and cost-effective massive data processing and storage systems to support online satellite image distribution. Hybrid storage systems, which contain solid-state drives (SSDs), hard disk drives (HDDs), and tapes, can provide an ideal data storage solution for a wide variety of data processing centers like EROS. Large-scale hybrid storage systems will become increasingly popular in the next few years for two reasons. First, highly accessed storage objects in a hybrid storage system can be prefetched and cached to high-speed storage components such as solid-state drives, so an SSD-based hybrid storage system can provide large storage capacity, high I/O performance, and data reliability. Second, hybrid storage systems are cost-effective, because inexpensive tapes increase storage capacity at very low cost. Transferring data back and forth among SSDs, HDDs, and tapes plays a critical role in achieving high I/O performance. We therefore propose data mining algorithms that can judiciously prefetch data. Our analytical model and experimental results reveal that our data mining prefetching algorithm increases the performance of hybrid storage systems.
  • Keywords
    Internet; data mining; geophysical image processing; storage management; Earth resources observation and science center; SSD-based hybrid storage system; U.S geological survey; data mining prefetching algorithm; data reliability; data transfer; global online satellite image distribution system; high speed cost effective massive data processing; high speed storage component; highly accessed storage object; solid state drive; storage capacity; Data mining; Hard disks; Nonhomogeneous media; Prefetching; Satellites; Servers; Storage area networks; Data Mining; Prefetching; Solid State Disks;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communications (HPCC), 2011 IEEE 13th International Conference on
  • Conference_Location
    Banff, AB
  • Print_ISBN
    978-1-4577-1564-8
  • Electronic_ISBN
    978-0-7695-4538-7
  • Type

    conf

  • DOI
    10.1109/HPCC.2011.71
  • Filename
    6063031