Low Error-Rate Approximate Multiplier Design for DNNs with Hardware-Driven Co-Optimization

Yao Lu, Jide Zhang, Su Zheng, Zhen Li, Lingli Wang

State Key Laboratory of ASIC & System, Fudan University, Shanghai, China

Emails: {yaolu20, szheng19, lizhen19, llwang}@fudan.edu.cn, zhangjd21@m.fudan.edu.cn

Abstract—In this paper, two approximate 3×3 multipliers are proposed. Synthesis results with the ASAP-7nm process library show that they reduce the area by 31.38% and 36.17%, and the power consumption by 36.73% and 35.66%, respectively, compared with the exact multiplier. They can be aggregated with a 2×2 multiplier to produce an 8×8 multiplier with a low error rate based on the distribution of DNN weights. We propose a hardware-driven software co-optimization method to improve the DNN accuracy by retraining. Based on the two proposed approximate 3-bit multipliers, three approximate 8-bit multipliers with low error rates are designed for DNNs. Compared with the exact 8-bit unsigned multiplier, our design can achieve a significant advantage over other approximate multipliers on the public dataset.

Index Terms—hardware-driven co-optimization, approximate computing, low error-rate multiplier design, low DNN accuracy loss

I. INTRODUCTION

Approximate computing has gradually become an energy-saving solution for digital systems [1]. In error-tolerant applications that do not require high precision, such as image processing, communication systems, and deep neural networks (DNNs), approximate arithmetic circuits usually bring hardware cost advantages in area, power, and critical path delay. In DNN applications, the accuracy loss caused by approximation can be made negligible by carefully co-optimizing the approximate multiplier architectures with the DNN retraining software platform.

Reference [2] has shown that approximate multipliers can effectively reduce the area and power consumption with a small precision loss. A common way to approximate multiplication is by algebraic methods. A logarithm-based multiplication system is introduced in [3], which approximates the logarithm with a linear function using Mitchell's algorithm. An iterative logarithmic multiplier is introduced in [4], which progressively approaches the exact value by controlling the number of decimal iterations. A further derived truncated error correction [5] is applied to the truncated iterative multiplier in [6], which provides greater flexibility than static approximate multiplication, with improvements in latency, precision, and hardware cost.
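As a rough illustration of the logarithmic approach, the sketch below implements the classic Mitchell approximation for unsigned integers: each operand is written as 2^k (1 + x) with 0 ≤ x < 1, and log2 is approximated by k + x. It is not the system of [3]; the function name `mitchell_multiply` and the fixed-point scaling are assumptions made for this example.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for unsigned integers with Mitchell's method.

    Each operand is written as 2**k * (1 + x), 0 <= x < 1, and
    log2(operand) is approximated by the linear function k + x.
    """
    if a == 0 or b == 0:
        return 0
    k1 = a.bit_length() - 1          # position of the leading one of a
    k2 = b.bit_length() - 1          # position of the leading one of b
    # Fractions x1 and x2, both scaled by 2**(k1 + k2) so they stay integers.
    f1 = (a - (1 << k1)) << k2
    f2 = (b - (1 << k2)) << k1
    frac_sum = f1 + f2               # (x1 + x2) * 2**(k1 + k2)
    one = 1 << (k1 + k2)             # 1.0 in the same scale
    if frac_sum < one:
        # x1 + x2 < 1: result is 2**(k1 + k2) * (1 + x1 + x2)
        return one + frac_sum
    # x1 + x2 >= 1: result is 2**(k1 + k2 + 1) * (x1 + x2)
    return 2 * frac_sum
```

For example, `mitchell_multiply(5, 6)` gives 28 instead of 30, and the result is exact whenever one operand is a power of two. The approximation never overestimates the product, which is why later work adds correction terms.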

This work is supported by the National Natural Science Foundation of China under grants 61971143 and 62174035.

Moreover, research on approximate multiplication is no longer limited to arithmetic algorithms but also covers hardware architectures. Reference [7] describes in detail a low-power, high-performance approximate multiplier, SiEi, which improves accuracy by compensating for some of the errors. The RoBA multiplier in [8] is characterized by high speed and energy savings. It rounds the operands to the nearest power of two, which avoids the computation-intensive part of the multiplication at the cost of a high error rate. The error-tolerant multiplier (ETM) of [9] splits the multiplier into an accurate multiplication part for the most significant bits (MSBs) and a non-multiplication part for the least significant bits (LSBs). The method of modifying a low bit-width multiplier based on the Karnaugh map (K-map), proposed in [10], has been shown to effectively reduce the area and critical path delay. This method is also used in [11], [12] for the approximate 4-2 compressor, full adder, and half adder, which are then applied in the reduction process of the Wallace multiplier tree.
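To make the rounding idea concrete, the following sketch rounds each operand to the nearest power of two and keeps only the terms that can be computed with shifts, dropping the residual product (a − a_r)(b − b_r). This is a simplified illustration of a rounding-based multiplier, not a reproduction of the design in [8]; the function names and the tie-breaking rule (ties round up) are assumptions.

```python
def round_to_pow2(n: int) -> int:
    """Round a positive integer to the nearest power of two (ties round up)."""
    lower = 1 << (n.bit_length() - 1)   # largest power of two <= n
    upper = lower << 1                  # smallest power of two > n
    return lower if n - lower < upper - n else upper


def rounding_based_multiply(a: int, b: int) -> int:
    """Approximate a * b using only products that involve a power of two.

    a * b = ar*b + a*br - ar*br + (a - ar)*(b - br); the last term is dropped.
    In hardware, every remaining product reduces to a shift.
    """
    if a == 0 or b == 0:
        return 0
    ar = round_to_pow2(a)
    br = round_to_pow2(b)
    return ar * b + a * br - ar * br
```

With these assumptions, `rounding_based_multiply(7, 7)` returns 48 rather than 49, because both operands round to 8 and the dropped term is (7 − 8)(7 − 8) = 1. The result is exact whenever either operand is already a power of two.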

In [13], [14], different error metrics are defined for approximate multipliers: error distance (ED), mean error distance (MED), normalized mean error distance (NMED), mean relative error distance (MRED), and error rate (ER). These metrics are useful for evaluating multipliers but are not necessarily suitable for every application. Hence, DNN accuracy loss (DAL) is adopted to reflect the loss of DNN accuracy caused by approximate multipliers.
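The metrics listed above can be computed by exhaustive comparison against the exact product. The sketch below assumes the usual conventions: ED is the absolute difference, NMED normalizes MED by the maximum exact product, MRED averages ED/exact over input pairs with a nonzero exact product, and ER is the fraction of input pairs with a wrong result. The definitions in [13], [14] may differ in detail, and `toy_mul` is a hypothetical 2×2 multiplier used only as a test input.

```python
from typing import Callable, Dict


def error_metrics(approx_mul: Callable[[int, int], int], bits: int) -> Dict[str, float]:
    """Exhaustively compare approx_mul with exact unsigned multiplication."""
    n_values = 1 << bits
    max_product = (n_values - 1) ** 2   # NMED normalizer (assumed convention)
    total_ed = 0                        # sum of error distances
    total_red = 0.0                     # sum of relative error distances
    n_errors = 0                        # input pairs with a wrong result
    n_nonzero = 0                       # input pairs with a nonzero exact product
    for a in range(n_values):
        for b in range(n_values):
            exact = a * b
            ed = abs(approx_mul(a, b) - exact)
            total_ed += ed
            if ed:
                n_errors += 1
            if exact:
                total_red += ed / exact
                n_nonzero += 1
    n_pairs = n_values * n_values
    med = total_ed / n_pairs
    return {
        "MED": med,
        "NMED": med / max_product,
        "MRED": total_red / n_nonzero,
        "ER": n_errors / n_pairs,
    }


def toy_mul(a: int, b: int) -> int:
    """Hypothetical 2x2 multiplier that is wrong only for 3 x 3 (returns 7)."""
    return 7 if (a, b) == (3, 3) else a * b


metrics = error_metrics(toy_mul, bits=2)
```

For this toy multiplier only one of the 16 input pairs is wrong, so ER is 1/16 and MED is 2/16. A low ER on a small block does not by itself predict the error of a large aggregated multiplier, which is one reason an application-level metric such as DAL is useful.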

Through the aggregation of low bit-width multipliers, the Wallace multiplier can effectively reduce the number of adder layers and thus shorten the critical path. Reference [10] introduces a 2×2 approximate multiplier, which significantly improves both area and power consumption but leads to a high accuracy loss after aggregation into a large multiplier. Hence, two 3×3 approximate multiplier architectures are proposed, which can be used for partial product generation when aggregating into large multipliers according to the distribution of DNN weights. In DNN accelerators, 8-bit is a common configuration. The results of the 8-bit quantization experiment in [15] are convincing, and Eyeriss-v2 in [16] also uses an 8-bit quantization configuration. Three 8×8 unsigned multiplier architectures are then proposed based on 3-bit multipliers and evaluated by the extended DNN platform based on [17].
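The aggregation of small multipliers into an 8×8 multiplier can be sketched as a sum of shifted partial products. The example below splits each 8-bit operand into fields of 3, 3, and 2 bits (least significant first) and passes every field pair to a pluggable small multiplier. The field order, the function names, and the use of a single `small_mul` for all field pairs are assumptions for illustration; the architectures proposed in the paper decide which partial products use which approximate block, and that mapping is not reproduced here.

```python
from typing import Callable, List, Tuple

SmallMul = Callable[[int, int], int]


def split_fields(x: int, widths: Tuple[int, ...]) -> List[Tuple[int, int]]:
    """Split x into (value, shift) fields, least significant field first."""
    fields = []
    shift = 0
    for width in widths:
        value = (x >> shift) & ((1 << width) - 1)
        fields.append((value, shift))
        shift += width
    return fields


def aggregate_multiply(
    a: int,
    b: int,
    small_mul: SmallMul = lambda x, y: x * y,
    widths: Tuple[int, ...] = (3, 3, 2),
) -> int:
    """Multiply two unsigned operands by summing shifted small products."""
    total = 0
    for a_val, a_shift in split_fields(a, widths):
        for b_val, b_shift in split_fields(b, widths):
            # Each small product is weighted by the position of its fields.
            total += small_mul(a_val, b_val) << (a_shift + b_shift)
    return total
```

With the default exact `small_mul`, `aggregate_multiply(a, b)` equals `a * b` for all 8-bit inputs, so any deviation comes only from the approximate block that is plugged in. Passing an approximate function as `small_mul` then shows how errors in small blocks propagate to the 8×8 result.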

With the proposed approximate multipliers, the evaluation results show that the average DAL is about 0.4% and 12% for LeNet on the MNIST dataset [18] and the CIFAR10 dataset [19], respectively. Our platform then retrains the DNN with regularization to improve DNN accuracy, which helps reduce the average accuracy loss to 0.2% and 9%. In addition,

arXiv:2210.03916v2 [cs.AR] 16 Nov 2022
