D. Parikh, K. Skadron, Y. Zhang, M. Barcella, and M. Stan.
In Proc. of the 2002 International Symposium on High-Performance Computer Architecture, February 2002, Cambridge, MA.
Abstract
This paper explores the role of branch predictor organization in
power/energy/performance tradeoffs for processor design. We find that,
as a general rule, spending more power in the branch predictor is
worthwhile for reducing overall processor energy consumption when it
yields more accurate predictions that improve running time.
Two techniques, however, provide substantial reductions in power
dissipation without harming accuracy. Banking reduces the
portion of the branch predictor that is active at any one time. A
new on-chip structure, the prediction probe detector (PPD), can
use pre-decode bits to entirely eliminate unnecessary predictor and
branch target buffer (BTB) accesses. Despite the extra power that must
be spent accessing the PPD, it reduces local predictor power and energy
dissipation by about 45% and overall processor power and energy
dissipation by 5-6%.
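
The gating idea behind the PPD can be illustrated with a small sketch. This is not the paper's implementation. It assumes, purely for illustration, one PPD bit per instruction-cache line, set from pre-decode bits when the line contains any control-flow instruction. The names `build_ppd`, `simulate_fetch`, and `AccessCounts` are hypothetical. The sketch counts structure accesses only and does not model power.

```python
from dataclasses import dataclass


@dataclass
class AccessCounts:
    """Number of lookups made to each structure during a fetch sequence."""
    ppd: int = 0
    predictor: int = 0
    btb: int = 0


def build_ppd(cache_lines):
    """Derive one PPD bit per cache line from pre-decode bits.

    cache_lines: list of lines, each a list of booleans where True marks
    a control-flow instruction (an assumed pre-decode encoding).
    """
    return [any(line) for line in cache_lines]


def simulate_fetch(cache_lines, fetch_sequence):
    """Count accesses when the PPD gates predictor and BTB lookups."""
    ppd = build_ppd(cache_lines)
    counts = AccessCounts()
    for line_index in fetch_sequence:
        counts.ppd += 1  # the PPD itself is probed on every fetch
        if ppd[line_index]:
            # Only lines that contain a branch need a prediction.
            counts.predictor += 1
            counts.btb += 1
    return counts


# Hypothetical example: three cache lines, only line 1 holds a branch.
lines = [[False, False], [False, True], [False, False]]
counts = simulate_fetch(lines, fetch_sequence=[0, 1, 2, 1])
print(counts)
```

In this toy run the predictor and BTB are each accessed on two of the four fetches instead of all four, at the cost of four PPD probes. Whether that trade saves power overall depends on the relative cost of a PPD access, which the sketch does not capture.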