Abstract
Machine learning algorithms have become a major tool in a wide range of applications. Their performance requirements on large-scale datasets pose a challenge for traditional von Neumann architectures. We present two machine learning implementations and evaluations on PRINS, a novel processing-in-storage system based on resistive content addressable memory (ReCAM). PRINS functions simultaneously as a storage device and a massively parallel associative processor. By keeping computation inside the storage arrays, PRINS implements in-data rather than near-data processing, resolving the bandwidth wall faced by near-data von Neumann architectures such as a three-dimensional DRAM-and-CPU stack or an SSD with an embedded CPU. We show that the PRINS-based processing-in-storage architecture may outperform existing in-storage and accelerator-based designs. We compare the ReCAM processing-in-storage implementations of K-means and K-nearest neighbors against CPU, GPU, FPGA, and Automata Processor platforms, and show that PRINS may achieve an order-of-magnitude speedup and improved power efficiency relative to all compared platforms.
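The core primitive behind a CAM-based associative processor like PRINS is a parallel compare of a search key against every stored row at once. The following is a minimal software sketch of that primitive under simplifying assumptions; the function name `cam_match`, the masked-compare interface, and the sequential loop are illustrative only and do not reflect the paper's actual hardware or algorithms.

```python
def cam_match(rows, key, mask=None):
    """Return indices of rows that equal `key` under an optional bit mask.

    In a real content addressable memory, every row is compared against the
    key simultaneously in a single cycle; this loop models that behavior
    sequentially for clarity.
    """
    matches = []
    for i, word in enumerate(rows):
        if mask is None:
            hit = word == key
        else:
            # Ternary-CAM-style search: only bits set in `mask` participate.
            hit = (word & mask) == (key & mask)
        if hit:
            matches.append(i)
    return matches

rows = [0b1010, 0b1100, 0b1010, 0b0111]
print(cam_match(rows, 0b1010))          # exact match -> [0, 2]
print(cam_match(rows, 0b1000, 0b1000))  # top bit set -> [0, 1, 2]
```

Distance computations for K-means and K-nearest neighbors can be built on top of such masked compares applied bit-serially across all rows, which is what makes the storage array itself act as the processor.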
| Original language | English |
|---|---|
| Article number | 8275038 |
| Pages (from-to) | 889-896 |
| Number of pages | 8 |
| Journal | IEEE Transactions on Nanotechnology |
| Volume | 17 |
| Issue number | 5 |
| DOIs | |
| State | Published - Sep 2018 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2002-2012 IEEE.
Keywords
- CAM
- Near-data processing
- RRAM
- associative processing
- memristors
- processing-in-memory
- processing-in-storage