• DocumentCode
    2193591
  • Title
    Improving trace cache effectiveness with branch promotion and trace packing
  • Author
    Patel, Sanjay Jeram ; Evers, Marius ; Patt, Yale N.
  • Author_Institution
    Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
  • fYear
    1998
  • fDate
    27 Jun-1 Jul 1998
  • Firstpage
    262
  • Lastpage
    271
  • Abstract
    The increasing widths of superscalar processors are placing greater demands upon the fetch mechanism. The trace cache meets these demands by placing logically contiguous instructions in physically contiguous storage. As a result, the trace cache delivers instructions at a high rate by supplying multiple fetch blocks each cycle. In this paper we examine two techniques to improve the number of instructions delivered each cycle by the trace cache. The first technique, branch promotion, dynamically converts strongly biased branches into branches with static predictions. Because these promoted branches require no dynamic prediction, the branch predictor suffers less from the negative effects of interference. Branch promotion unlocks the potential of the second technique: trace packing. With trace packing, trace segments are packed with as many instructions as will fit, without regard to naturally occurring fetch block boundaries. With both techniques, the effective fetch rate of the trace cache jumps up 17% over a trace cache which implements neither on a machine where the execution engine has a very aggressive memory disambiguator; the performance of a machine using branch promotion and trace packing is on average 11% higher than a machine using neither technique.
  • Keywords
    parallel architectures; performance evaluation; branch promotion; fetch mechanism; logically contiguous instructions; multiple fetch blocks; physically contiguous storage; superscalar processors; trace cache effectiveness; trace packing; trace segments; Bandwidth; Cache storage; Computer architecture; Decoding; Electrical capacitance tomography; Electronic switching systems; Engines; Etching; Laboratories; Read only memory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998
  • Conference_Location
    Barcelona
  • ISSN
    1063-6897
  • Print_ISBN
    0-8186-8491-7
  • Type
    conf
  • DOI
    10.1109/ISCA.1998.694786
  • Filename
    694786