Abstract
Optimized hardware for executing large dot-product (DP) calculations is central to many of today's integrated circuits. These arithmetic blocks are often implemented with the parallel fused DP (FDP) approach and, to achieve high performance, are realized with a tree-based compression algorithm using commercially available synthesis macros. However, these macros optimize the performance of the gate-level netlist and fail to take into account the consequences of the applied heuristics on the physical implementation (layout) of these large circuits. In this article, we propose a physically aware approach to FDP implementation based on the affinity between the logic gates that make up the gate-level structure. The proposed clustered DP (CDP) algorithm enables the place-and-route tools to cluster gates with high affinity, leading to higher placement utilization and lower routing congestion. DP calculations with up to 78 multipliers were implemented with a 65-nm CMOS standard cell library, providing a power reduction of up to 63%, up to 60% lower area, and performance improvements as high as 2.5×, based on post-layout results, as compared with similar implementations built on commercial macros.
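For readers unfamiliar with the fused DP structure mentioned in the abstract, the following is a minimal behavioral sketch in Python of a conventional fused dot product with tree-style carry-save compression. It is not the CDP algorithm proposed in the article and says nothing about placement or gate affinity. It only illustrates the general idea of pooling the partial products of all multipliers, compressing them with 3:2 carry-save adders, and performing a single final carry-propagate addition. The function names, the unsigned operands, and the 8-bit default width are assumptions made for illustration.

```python
def partial_products(a, b, width):
    """Shifted partial-product rows of a * b (unsigned, b < 2**width)."""
    return [a << i for i in range(width) if (b >> i) & 1]


def carry_save_add(x, y, z):
    """3:2 compressor: three rows in, a sum row and a carry row out."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1
    return sum_row, carry_row


def fused_dot_product(xs, ys, width=8):
    """Dot product computed as one shared compression tree (illustrative)."""
    # Pool the partial products of every multiplier into one list of rows.
    rows = []
    for a, b in zip(xs, ys):
        rows.extend(partial_products(a, b, width))

    # Compress three rows into two until at most two rows remain.
    while len(rows) > 2:
        next_rows = []
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fill a group of three pass through unchanged.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows

    # Single final carry-propagate addition.
    return sum(rows)
```

In this sketch the intermediate products are never summed individually; only the final two rows go through a full addition, which is the property that distinguishes a fused DP from a chain of separate multiply and add steps.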
Original language | English |
---|---|
Article number | 8772144 |
Pages (from-to) | 2886-2897 |
Number of pages | 12 |
Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems |
Volume | 39 |
Issue number | 10 |
DOIs | 10.1109/TCAD.2019.2931185 |
State | Published - Oct 2020 |
Bibliographical note
Publisher Copyright: © 1982-2012 IEEE.
Funding
Manuscript received November 8, 2018; revised February 20, 2019 and May 9, 2019; accepted July 8, 2019. Date of publication July 25, 2019; date of current version September 18, 2020. This work was supported by the HiPer Consortium through the MAGNET Program of the Israeli Innovation Authority and done in collaboration with SatixFy Israel, Ltd. This article was recommended by Associate Editor I. H. R. Jiang. (Corresponding author: Or Maltabashi.) The authors are with the Emerging Nanoscale Integrated Circuits and Systems Laboratories, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TCAD.2019.2931185
Funders | Funder number |
---|---|
Israeli Innovation Authority | |
SatixFy Israel, Ltd. | |
Keywords
- Clustered dot-product
- Wallace tree
- digital signal processing (DSP)
- high-speed
- low-power design
- multiplication algorithm
- multiplier
- physically aware multiplier
- physically aware synthesis
- place and route
- sum-of-products