• DocumentCode
    704733
  • Title

    Where does the time go? Characterizing tail latency in Memcached

  • Author

    Blake, Geoffrey ; Saidi, Ali G.

  • fYear
    2015
  • fDate
    29-31 March 2015
  • Firstpage
    21
  • Lastpage
    31
  • Abstract
    To function correctly, Online, Data-Intensive (OLDI) services require low and consistent service times. Maintaining predictable service times entails meeting 99th or higher percentile latency targets across hundreds to thousands of servers in the data center. However, to maintain the 99th percentile targets, servers are routinely run well below full utilization. The main difficulty in optimizing a server to run closer to peak utilization while maintaining predictable 99th percentile response latencies is identifying and mitigating the causes of a request missing the target service time. In practice, this analysis is challenging because requests and responses overlap their execution with respect to one another and traverse multiple layers of software, user/kernel protection boundaries, and the hardware/software divide. Traditional profiling methods that record the time spent in each function usually yield few clues as to where a bottleneck may be present, because each of the many layers of software consumes only a small fraction of the time. In this work, we analyze the end-to-end sources of latency in a Memcached server from the wire through the kernel into the application and back again. To do so, we develop a tool that utilizes the Linux SystemTap infrastructure to measure latency throughout the many software layers that make up the complete request and response path for Memcached. While memory copies and the Linux networking stack are often suggested as major contributors to latency, we find that the main cause of missing response latency guarantees is the formation of standing queues and the application's inability to detect and remedy this situation.
  • Keywords
    Linux; cache storage; computer centres; parallel processing; storage management; Linux SystemTap infrastructure; Memcached server; OLDI service; data center; end-to-end latency source; online data-intensive service; tail latency; Hardware; IP networks; Kernel; Probes; Servers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on
  • Conference_Location
    Philadelphia, PA
  • Type

    conf

  • DOI
    10.1109/ISPASS.2015.7095781
  • Filename
    7095781