Computer Science, 2017, Vol. 44, Issue (3): 195-201. doi: 10.11896/j.issn.1002-137X.2017.03.042

Efficient BTB Based on Taken Trace

XIONG Zhen-ya, LIN Zheng-hao and REN Hao-qi   

  • Online:2018-11-13 Published:2018-11-13

Abstract: Computer architecture must balance two opposing goals: performance and energy consumption. To reduce the growing energy consumption of embedded processors, we propose the taken trace branch target buffer (TG-BTB), an energy-efficient BTB scheme for embedded processors. Unlike a conventional BTB, which must be looked up on every instruction fetch, the TG-BTB is looked up only when the current trace is a taken trace. The structure dynamically analyzes trace behavior during program execution, so that the BTB is looked up once per taken trace and the energy spent on BTB lookups is reduced. During this dynamic analysis, TG-BTB first detects the instruction interval between two taken instructions and stores this value in the TG-BTB. The scheme then decides whether to perform a BTB lookup according to the stored instruction interval. Experimental results show that TG-BTB reduces energy consumption by 81% compared with the conventional BTB scheme.
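The interval-based gating described in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the authors' hardware design: the function name `count_btb_lookups`, the table keyed by the first address of each trace, and the `missed_taken` counter are all hypothetical and were chosen for illustration. The model counts how many BTB lookups a conventional scheme and an interval-gated scheme would perform on the same fetch stream.

```python
def count_btb_lookups(fetch_stream):
    """Compare BTB lookup counts for a conventional and an interval-gated scheme.

    fetch_stream: list of (pc, is_taken) pairs in fetch order, where is_taken
    marks an instruction that redirects fetch (a taken branch).
    Returns (conventional_lookups, gated_lookups, missed_taken).
    """
    # Conventional scheme: one BTB lookup on every instruction fetch.
    conventional = len(fetch_stream)

    # Assumption: the learned interval is stored per trace start address.
    learned_interval = {}
    gated = 0          # lookups performed by the gated scheme
    missed_taken = 0   # taken instructions fetched while lookup was suppressed

    trace_start = None  # first pc of the current trace
    distance = 0        # instructions fetched in this trace before the current one
    skip = 0            # remaining fetches for which the lookup is suppressed

    for pc, is_taken in fetch_stream:
        if trace_start is None:
            # A new trace begins: reuse the interval learned earlier, if any.
            trace_start = pc
            distance = 0
            skip = learned_interval.get(pc, 0)

        if skip > 0:
            skip -= 1
            looked_up = False
        else:
            gated += 1
            looked_up = True

        if is_taken:
            if not looked_up:
                missed_taken += 1
            # Record the interval between the trace start and this taken instruction.
            learned_interval[trace_start] = distance
            trace_start = None
        else:
            distance += 1

    return conventional, gated, missed_taken


# A four-instruction loop whose last instruction is a taken branch, run five times.
loop_stream = [(0, False), (1, False), (2, False), (3, True)] * 5
print(count_btb_lookups(loop_stream))
```

In this toy loop the first iteration looks up the BTB on every fetch while the interval is learned, and each later iteration needs a single lookup at the taken instruction. The lookup counts here say nothing about the energy figures reported in the paper, which depend on the actual hardware and workloads.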

Key words: Taken trace, Instruction interval, BTB, Energy consumption

