Transcript (.ppt)
Accuracy-Configurable Adder for Approximate Arithmetic Designs
Andrew B. Kahng, Seokhyeong Kang
VLSI CAD LABORATORY, UC San Diego
49th Design Automation Conference, June 6th, 2012
-1-

Outline
  Background and Motivation
  Accuracy Configurable Adder Design
  Experimental Setup and Results
  Conclusions and Ongoing Works
-2-

Why Approximate Designs?
Threats to traditional IC design approach ...
  Extreme variations / Reliability issues / Cost
  Variations: PVT variation and uncertainty lead to design overhead
  Reliability issues: Hard errors (NBTI, latchup), Soft errors (α-particle)
  Cost: Cost (power/performance) of perfect accuracy is too high!
Approximate designs
  Relaxing the requirement of correctness can dramatically reduce costs of the design
What is the square root of 10?
  “a little more than three”
  “3.162278....”
  Approximation could be faster and more powerful
-3-

Previous Approximate Adders
Lu et al. IEEE Computer 2004
  Faster adder w/ shorter carry chain
  High performance with small error rate
  Large area overhead: not applicable for low energy design
Zhu et al. TVLSI 2010
  ETAI: accurate part + inaccurate part
  Reduce error size
  Error rate is high
Output accuracy is fixed: benefits can be limited by required accuracy
-4-

Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved …
[Figure: normalized power vs. time for an accurate design (1.0) and an accuracy-configurable design that switches between accurate mode and approximate mode; required accuracy levels shown: 80%, 100%, 90%, 80%; marker: "event occurred"]
Accuracy-configurable design adapts to changing requirements by using different modes in each situation
-5-

Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved …
[Figure: same plot as on the previous slide]
Accuracy-configurable approximate adder
[Diagram: approximate adder followed by error correction (ECC-1) and error correction (ECC-2)]
  Mode 1: turn off ECC-1, ECC-2 (accuracy: 90%)
  Mode 2: turn off ECC-2 (accuracy: 95%)
  Mode 3: turn on all ECC (accuracy: 100%)
-6-

Outline
  Background and Motivation
  Accuracy Configurable Adder Design
  Experimental Setup and Results
  Conclusions and Ongoing Works
-7-

Approximate Adder Implementation
16-bit adder case
  AH=A[15:8], AM=A[11:4], AL=A[7:0]
[Diagram: inputs A[15:0] and B[15:0] feed three 8-bit adders that together produce SUM]
  8-bit adder AH+BH: SUMH gives carry (SUM[16]) and SUM[15:12]
  8-bit adder AM+BM: SUMM gives SUM[11:8]
  8-bit adder AL+BL: SUML gives SUM[7:4] and SUM[3:0]
Carry chain is cut to reduce critical path delay
Sub-adders generate results of partial summation
Middle sub-adder improves accuracy (error 50% → 5.5%)
-8-

Approximate Adder Implementation
N-bit adder case
  N: bit width, k: ½ carry-chain depth
[Diagram: sub-adders take k-bit operand slices A and B [N-1:N-k], [N-k-1:N-2k], [N-2k-1:N-3k] and produce SUM[N-1:N-k] with carry, SUM[N-k-1:N-2k], SUM[N-2k-1:N-3k]]
Probability of correct result:
Approximate adder can be configured with “k”
Estimation over CLA (N=16)

  k                 2      3      4      5      6
  Min. clock cycle  0.5    0.65   0.75   0.83   0.89
  area              0.87   1.05   1.12   1.15   1.12
  power             0.44   0.68   0.84   0.95   1.00
  pass rate         0.554  0.829  0.942  0.982  0.995
-9-

Error Detection and Correction
[Diagram: IN feeds the approximate adder (sub-adder i, sub-adder i+1), whose output SUMapprox goes to the EDC circuit (incrementor), which produces SUMcorrect at OUT; signals: sum i, carry i+1, error i, error, data, stall]
Error can be detected and corrected with small overhead
Variable latency operation
Error detection: ‘and’ gates
Error correction: incrementor circuit
Error detection and correction can take more time than the critical path delay of a “sub-adder”; the throughput can be reduced
-10-

Accuracy Configuration with Pipeline
[Diagram: 4-stage pipeline. Stage 1: approximate adder on A and B produces SUM as sub-sums S3 S2 S1 S0 (approximate). Stage 2: correction on S1 (errors on S1). Stage 3: correction on S2 (errors on S2). Stage 4: correction on S3 (errors on S3), producing SUMcorrect. Sub-sums are marked approximate or correct after each stage.]
Each stage generates a result with different accuracy
Can turn off later stages with power gating according to accuracy requirement

  Config.  Power gating    Accuracy  Power reduction
  Mode-1   None            1.000     -11.5%
  Mode-2   Stage 4         0.960     12.4%
  Mode-3   Stage 3, 4      0.925     31.0%
  Mode-4   Stage 2, 3, 4   0.900     51.6%
-11-

Outline
  Background and Motivation
  Accuracy Configurable Adder Design
  Experimental Setup and Results
  Conclusions and Ongoing Works
-12-

Experimental Setup and Metrics
Experimental Setup
  Library: TSMC 65GP
  Implementation: Synopsys Design Compiler
  Simulation: Cadence NC-SIM
  Input patterns: random data and actual data
  Library preparation: Cadence Library Characterizer
Accuracy Metrics

  Metric   Definition        Data type
  ACCamp   1 - |Rc-Re|/Rc    Amplitude data
  ACCinf   1 - Be/Bw         Information data

  Rc and Re: correct and obtained results
  Be: number of error bits, Bw: bit-width of data
-13-

Approximate Adder Comparison
Accuracy vs. power consumption
Image smoothing (Gaussian filter)
[Figure: six image panels (a) to (f)]
  (a) Original image
  (b) Accurate adder
  (c) ACA (PSNR 24.5dB)
  (d) ETAI (25.3dB)
  (e) ETAII (16.2dB)
  (f) LU (11.1dB)
(c)~(f) have 50% power of accurate adder (b)
* ETAI cannot detect and correct errors
-14-

Approximate Adder Comparison
Accuracy vs. power consumption w/ voltage scaling
Voltage scaling (1.0V~0.6V)
[Figure: two plots, ACCamp and ACCinf (0.400 to 1.000) vs. total power (W) (2.00E-04 to 8.00E-04), for ACA adder, CLA, Lu's adder, ETAI, ETAIIM]
ACA adder shows fine results (accuracy vs. power) on both ACCamp and ACCinf metrics
-15-

Accuracy Configuration and Power Saving
Power saving from voltage scaling + mode change
  4-stage 32-bit adder case
  Accuracy: 1.0 → 0.9
[Figure: total power consumption (W) (0.00E+00 to 7.00E-03) vs. ACCinf (0.80 to 1.00) for a conventional pipelined adder and the ACA adder in modes 1 to 4; annotations: "voltage scaling", "mode change", "accurate result", "4X reduction"]
Accuracy configuration w/ mode change is more effective than w/ voltage scaling
-16-

Accuracy Configuration and Power Saving
Power consumption when accuracy requirement is varying (w/ SPEC 2006 benchmarks)
[Figure: normalized power consumption (0 to 1) vs. accuracy (0.95 to 1.00) for mode-1 to mode-4; annotation: "High accuracy"]
Average 30% power savings over no accuracy configuration
-17-

Outline
  Background and Motivation
  Accuracy Configurable Adder Design
  Experimental Setup and Results
  Conclusions and Ongoing Works
-18-

Conclusions and Ongoing Works
Conclusions
  We proposed an accuracy-configurable approximate (ACA) adder, which can adapt to changing accuracy requirements
  ACA can provide 30% power reduction with accuracy configuration during runtime
Ongoing Works
  Accuracy-configurable design for other arithmetic units (multiplier, divider)
  Automated synthesis flow (minimize power under the required accuracy)
[Diagram: synthesis flow with blocks labeled RTL, Required accuracy, exact adder, approximate adder, Accuracy estimation, Synthesis]
-19-

Thank You!
-20-

Accuracy-Configurable Approximate Design
Required accuracy can change during runtime
  Idea of High-Efficiency Math highlighted by Intel Labs at ISSCC-2012
  Variable-precision floating point unit w/ accuracy tracking: 24-bit, 12-bit, 6-bit as needed
Accuracy-configurable design adapts to changing requirements, maximizing benefits of approximate design paradigm
[Figure: "Variable-precision Mantissa"; normalized power vs. time for an accurate design (1.0) and an accuracy-configurable design switching between accurate mode and approximate mode; required accuracy levels shown: 80%, 100%, 90%, 80%; marker: "event occurred"]
-21-
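Editor's sketch for the "Approximate Adder Implementation", "Error Detection and Correction" and "Experimental Setup and Metrics" slides. It is a minimal behavioral model in Python of the overlapping sub-adder scheme and the two accuracy metrics, not the authors' RTL. The function names (aca_add, pass_rate, acc_amp, acc_inf) are hypothetical. The error-detection rule is an assumption, since the slides only say that detection uses 'and' gates. Reading Be as the number of differing bits between Rc and Re is also an assumption.

```python
import random


def aca_add(a, b, n=16, k=4):
    """Return (approximate_sum, error_flag) for two n-bit unsigned operands.

    Hypothetical helper: sub-adders are 2k bits wide and start every k bits,
    e.g. A[7:0], A[11:4], A[15:8] for n=16, k=4.
    """
    assert n % k == 0 and n >= 2 * k
    mask_k = (1 << k) - 1
    mask_2k = (1 << (2 * k)) - 1

    # Lowest sub-adder (AL + BL): all 2k sum bits are kept.
    part = (a & mask_2k) + (b & mask_2k)
    total = part & mask_2k
    error = False

    for off in range(k, n - 2 * k + 1, k):
        # Sub-adder on bits [off + 2k - 1 : off], with no carry-in.
        part = ((a >> off) & mask_2k) + ((b >> off) & mask_2k)
        # Only its upper k sum bits go to the result.
        total |= ((part >> k) & mask_k) << (off + k)

        # Assumed detection rule: the k bits below the window generate a
        # carry, and the window's lower k bits would all propagate it.
        below = ((a >> (off - k)) & mask_k) + ((b >> (off - k)) & mask_k)
        all_propagate = (((a ^ b) >> off) & mask_k) == mask_k
        error = error or ((below >> k) == 1 and all_propagate)

    # Carry-out of the top sub-adder becomes SUM[n].
    total |= (part >> (2 * k)) << n
    return total, error


def pass_rate(n=16, k=4, trials=20000, seed=0):
    """Monte Carlo estimate of the fraction of exact results on random inputs."""
    rng = random.Random(seed)
    exact = 0
    for _ in range(trials):
        a, b = rng.getrandbits(n), rng.getrandbits(n)
        approx, _ = aca_add(a, b, n, k)
        exact += approx == a + b
    return exact / trials


def acc_amp(rc, re):
    """ACCamp = 1 - |Rc - Re| / Rc (Rc must be non-zero)."""
    return 1 - abs(rc - re) / rc


def acc_inf(rc, re, bw):
    """ACCinf = 1 - Be / Bw, with Be taken as the count of differing bits."""
    return 1 - bin(rc ^ re).count("1") / bw
```

As a usage example, aca_add(0x00FF, 0x0001) returns (0, True): the carry out of bit 7 is lost because the middle sub-adder has no carry-in, and the assumed detection rule flags it. On uniformly random 16-bit inputs, pass_rate() with k=4 should come out close to the 0.942 pass rate listed in the k table, although the slides do not state how that figure was obtained.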