Abstract
Near-data in-memory processing research has been gaining momentum in recent years. A typical processing-in-memory architecture places one or several processing elements next to volatile memory, enabling computation without transferring data to the host CPU. The increased bandwidth to and from volatile memory leads to performance gains. However, processing-in-memory does not alleviate the von Neumann bottleneck for big-data problems, where datasets are too large to fit in main memory. We present a novel processing-in-storage system based on Resistive Content Addressable Memory (ReCAM). It functions simultaneously as mass storage and as a massively parallel associative processor. ReCAM processing-in-storage resolves the bandwidth wall by keeping computation inside the storage arrays, without transferring data up the memory hierarchy. We show that a ReCAM-based processing-in-storage architecture may outperform existing processing-in-memory and accelerator-based designs. A ReCAM processing-in-storage implementation of Smith-Waterman DNA sequence alignment reaches a speedup of almost five over a GPU cluster. An implementation of in-storage inline data deduplication is presented and shown to achieve orders of magnitude higher throughput than traditional CPU- and DRAM-based systems.
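For readers unfamiliar with the alignment kernel named in the abstract, the sketch below shows the standard Smith-Waterman local-alignment scoring recurrence on a plain CPU, with arbitrary match/mismatch/gap parameters chosen for illustration. It is not the massively parallel ReCAM in-storage implementation described in the article; it only illustrates the computation that the architecture accelerates.

```python
# Minimal Smith-Waterman local alignment scoring sketch (linear gap penalty).
# Illustrative CPU reference only; scoring parameters are assumptions, not
# those used in the ReCAM implementation reported in the article.

def smith_waterman_score(a: str, b: str, match=2, mismatch=-1, gap=-1) -> int:
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # dynamic-programming score matrix
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            H[i][j] = max(0, diag, up, left)  # local alignment: never below zero
            best = max(best, H[i][j])
    return best

if __name__ == "__main__":
    # Prints the best local alignment score for two short DNA fragments.
    print(smith_waterman_score("GATTACA", "GCATGCT"))
```

In the ReCAM setting described by the authors, the per-cell work of this recurrence is what gets carried out associatively across the storage arrays rather than cell by cell on a host CPU.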
| Original language | English |
| --- | --- |
| Pages (from-to) | 99-116 |
| Number of pages | 18 |
| Journal | Supercomputing Frontiers and Innovations |
| Volume | 4 |
| Issue number | 3 |
| DOIs | |
| State | Published - 2017 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © The Authors 2017.
Keywords
- Associative Processing
- Content Addressable Memory
- In-Storage Processing
- Memristors