Abstract
This work proposes a novel platform for bringing a project from the concept to the tapeout stage in a short amount of time. An open-source and extendable RISC-V architecture is exploited to build a core with a small area footprint. This makes the research platform flexible in terms of design integration, while also allowing fast design cycles for research chips.
Original language | English |
---|---|
Title of host publication | IEEE International Symposium on Circuits and Systems, ISCAS 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 2614-2615 |
Number of pages | 2 |
ISBN (Electronic) | 9781665484855 |
DOIs | |
State | Published - 2022 |
Event | 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022 - Austin, United States; Duration: 27 May 2022 → 1 Jun 2022 |
Publication series
Name | Proceedings - IEEE International Symposium on Circuits and Systems |
---|---|
Volume | 2022-May |
ISSN (Print) | 0271-4310 |
Conference
Conference | 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022 |
---|---|
Country/Territory | United States |
City | Austin |
Period | 27/05/22 → 1/06/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.
Funding
LEO-I is the first demonstration of how the PulpEnIX platform can be used to enable very fast design cycles, bringing a project from the idea to the tapeout stage. The platform is provided as either a soft or hard macro and includes a CPU core, an AMBA bus, and peripheral blocks for off-chip interfaces. A general purpose port (GPP) that hangs off the APB interconnect is provided to easily integrate research modules, so that a block owner can focus on their own design without having to design, verify, and implement the entire system. The platform also provides the option to include higher-performance components connected through an AXI interconnect, such as the secondary core integrated in the LEO-I test chip.
The primary core is used to control the platform. It is an implementation of HAMSA-DI, a custom-designed, dual-issue RISC-V core branched off from the open-source RI5CY core (cv32e40p) [2]. The LEO-I test chip includes the first fabricated version of HAMSA-DI. Using the open-source RISC-V architecture enables the integration of in-pipe research blocks within the core, along with custom extensions for operating them and demonstrating their capabilities.
In the LEO-I test chip, a secondary core was provided for the demonstration of high-risk research features at the micro-architectural level within the core itself, without jeopardizing the entire platform. This “experimental core” can be configured during boot as an AXI master to control the platform. The core is used for any integrated research module that requires proper bus architectures, memory, and CPU. If a specific target application is given, the hardware IP can be tailored to the bus architecture. Depending on the research modules, the I/O mapping, clock distribution, overall power, and related software architecture can be adjusted.
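
To make the idea of research modules hanging off a shared peripheral interconnect more concrete, the following is a minimal Python sketch of memory-mapped address decoding. It is not taken from the LEO-I design: the region names, base addresses, and sizes are all hypothetical, and the model only illustrates how a bus address could be resolved to one module and an offset within it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One memory-mapped slave on the peripheral bus (hypothetical)."""
    name: str
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical address map; none of these values come from the LEO-I design.
ADDRESS_MAP = (
    Region("uart", 0x1A10_0000, 0x1000),
    Region("gpio", 0x1A10_1000, 0x1000),
    Region("gpp_module_0", 0x1A20_0000, 0x1000),
    Region("gpp_module_1", 0x1A20_1000, 0x1000),
)


def decode(addr: int):
    """Return (region name, byte offset) for a bus address, or raise if unmapped."""
    for region in ADDRESS_MAP:
        if region.contains(addr):
            return region.name, addr - region.base
    raise ValueError(f"address {addr:#010x} is not mapped")


if __name__ == "__main__":
    print(decode(0x1A20_1004))  # ('gpp_module_1', 4)
```

In a model like this, adding a research module amounts to adding one entry to the map, which mirrors the point that a block owner does not need to rework the rest of the system.
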
Thanks to its flexibility and the reuse of blocks such as the core, memory, and buses, our research platform offers a much faster and smoother design cycle than a conventional “clean slate” design flow. However, the performance, power consumption, and area footprint of the final chip will vary depending on the research modules that are integrated. On this basis, the name LEO-I was adopted, standing for “Low Effort of Integration”.
LEO-I Research Modules
The proposed platform is specifically designed to enable the integration of a wide variety of research modules, digital or fully custom, in-pipe or hanging off the system bus. The LEO-I test chip integrates as many as ten such blocks, including units for testing custom memories and register files, true random number generators, cryptographic blocks, clockless sequential pipelining, analog blocks, and more. Figs. 3 and 4 show the measurements of two research sub-blocks: a wave propagation circuit [3] and a gain-cell embedded dynamic random-access memory (GC-eDRAM) [4], [5].
The wave propagation module implements Clockless Wave Propagated Pipelining (CWPP), which enables energy-efficient, high-throughput computation without the power and area overheads of the internal registers required for standard sequential design [3]. Fig. 3 shows the post-silicon validation of the wave-propagated pipelining, where the clock strobe arrives within the stable data frame, together with the shmoo plot of customizable delay settings for correct clock strobe sampling. This research SoC integrates a CWPP-based 8 GMAC/s dot-product (DP) accelerator consuming as little as 497 fJ/MAC (2 TOPS/W). The DP unit is the first demonstration of a scalable CWPP design implemented with a CMOS-compatible flow [6].
As for the GC-eDRAM, Fig. 4 shows the post-silicon measurements of the memory retention time.
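
As a sanity check on the dot-product accelerator figures quoted above (8 GMAC/s, 497 fJ/MAC, 2 TOPS/W), the short Python sketch below relates energy per MAC to efficiency and power. Two assumptions are not stated in the source: that one MAC is counted as one operation (counting two operations per MAC would give roughly 4 TOPS/W instead), and that the peak throughput and the lowest energy per MAC occur at the same operating point. The power value is therefore only an implied estimate.

```python
ENERGY_PER_MAC_J = 497e-15     # 497 fJ/MAC, from the text
THROUGHPUT_MAC_PER_S = 8e9     # 8 GMAC/s, from the text
OPS_PER_MAC = 1                # assumption: one MAC counted as one operation


def tops_per_watt(energy_per_mac_j, ops_per_mac=OPS_PER_MAC):
    """Efficiency in tera-operations per second per watt (ops per joule / 1e12)."""
    return ops_per_mac / energy_per_mac_j / 1e12


def implied_power_w(energy_per_mac_j, throughput_mac_per_s):
    """Power in watts if every MAC costs the given energy at the given rate."""
    return energy_per_mac_j * throughput_mac_per_s


if __name__ == "__main__":
    print(f"{tops_per_watt(ENERGY_PER_MAC_J):.2f} TOPS/W")
    print(f"{implied_power_w(ENERGY_PER_MAC_J, THROUGHPUT_MAC_PER_S) * 1e3:.2f} mW")
```

Under these assumptions the sketch gives about 2.01 TOPS/W, consistent with the quoted 2 TOPS/W, and an implied accelerator power of roughly 4 mW.
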
The plot in Fig. 4 shows the degradation of the data over time, measured by varying the delay between a successful write and the subsequent read operation from 0 to 75 clock cycles.
Conclusions
We proposed a novel system-on-chip research platform that integrates several research modules from different research groups into one chip through a standard interface. The fabricated experimental chip shows that our reusable platform, with its relative ease of integration, allows more design freedom and shortens the long development time of research prototypes.
Acknowledgments
We acknowledge Prof. Luca Benini’s group at ETH Zurich for providing the FLL IP, as well as the open-source PULP platform. The tapeout was funded by the Israel Innovation Authority under the Kamin program and the Israel Ministry of Science and Technology. Additional research was supported by the Israel Science Foundation.
References
[1] L. Benini et al., “PULP platform,” [Online]. Available: https://pulp-platform.org/
[2] D. Schiavone et al., “cv32e40p (RI5CY) core,” [Online]. Available: https://github.com/openhwgroup/cv32e40p
[3] Y. Kra et al., “WP 2.0: Signoff-Quality Implementation and Validation of Energy-Efficient Clock-Less Wave Propagated Pipelining,” DATE 2021.
[4] R. Giterman et al., “A 1-Mbit Fully Logic-Compatible 3T Gain-Cell Embedded DRAM in 16-nm FinFET,” IEEE SSCL 2020.
[5] R. Giterman et al., “An 800-MHz Mixed-VT 4T IFGC Embedded DRAM in 28-nm CMOS Bulk Process for Approximate Storage Applications,” IEEE JSSC 2018.
[6] Y. Kra and A. Teman, “Silicon-Proven Clockless Wave-Propagated Pipelining for High-Throughput, Energy-Efficient Processing,” ISCAS 2021 Late-breaking News (LBN).
Funders | Funder number |
---|---|
Israel Innovation Authority | |
Israel Ministry of Science and Technology | |
Israel Science Foundation | |