PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques
Abstract
PiDRAM is the first flexible, end-to-end, open-source FPGA-based framework for evaluating real Processing-in-DRAM (PiM) techniques on real DRAM chips. The framework is prototyped on a Xilinx ZC706 FPGA and integrated with an open-source RISC-V system (Rocket Chip). As a demonstration, an end-to-end RowClone implementation delivers up to 14.6X speedup over conventional CPU data copy operations.
Report
Key Highlights
- Novel Framework: PiDRAM is introduced as the first open-source, flexible, end-to-end framework designed specifically for evaluating Processing-in-DRAM (PiM) techniques on real DRAM hardware.
- Real-World Testing: The framework moves PiM evaluation beyond simulation by utilizing real DRAM chips, allowing for accurate system integration studies.
- Performance Gains: Implementations demonstrated substantial acceleration; RowClone achieved up to 14.6X speedup for data copy (memcpy) and 12.6X speedup for data initialization (calloc) compared to conventional CPU operations.
- RISC-V Integration: PiDRAM is implemented on a Xilinx ZC706 FPGA running the open-source Rocket Chip RISC-V system, providing a robust, modular evaluation environment.
- High-Throughput Randomness: The D-RaNGe implementation successfully generated true random numbers at a high throughput of 8.30 Mb/s.
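
To put the speedup figures above in perspective, a speedup of S means the operation takes 1/S of the baseline time, so the fraction of time saved is 1 - 1/S. The following is a minimal sketch of that arithmetic; the function name `time_saved_fraction` is hypothetical and not part of PiDRAM. The reported numbers are peak ("up to") values, so the results are upper bounds rather than typical savings.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of baseline execution time eliminated by a given speedup.

    A speedup of S means new_time = old_time / S, so the saved
    fraction is 1 - 1/S.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Peak speedups reported in the summary (RowClone vs. CPU baselines).
reported = {"memcpy": 14.6, "calloc": 12.6}

for operation, speedup in reported.items():
    saved = time_saved_fraction(speedup)
    print(f"{operation}: up to {speedup}X speedup, up to {saved:.1%} less time")
```

At the reported peaks, this works out to roughly 93% less time for copy and roughly 92% less time for initialization.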
Technical Details
| Component | Specification/Method |
|---|---|
| Framework | PiDRAM (Processing-in-DRAM) |
| Platform | FPGA-based (Xilinx ZC706) |
| Host Processor | Open-source RISC-V core (Rocket Chip) |
| PiM Technique 1 | RowClone (in-DRAM copy/initialization using ComputeDRAM command sequences) |
| PiM Technique 2 | D-RaNGe (in-DRAM True Random Number Generator based on DRAM activation-latency failures) |
| Implementation Effort (LOC) | RowClone: 198 Verilog / 565 C++; D-RaNGe: 190 Verilog / 78 C++ |
| Copy Speedup | Up to 14.6X (RowClone vs. CPU memcpy) |
| Initialization Speedup | Up to 12.6X (RowClone vs. CPU calloc) |
| Random Number Throughput | 8.30 Mb/s (D-RaNGe) |
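
The throughput row can be turned into a rough latency estimate for generating a fixed number of random bits. The sketch below assumes that "Mb/s" means 10^6 bits per second and that the 8.30 Mb/s rate is sustained; both are assumptions, and the helper name `seconds_to_generate` is hypothetical.

```python
# Reported D-RaNGe throughput, interpreted as 10^6 bits per second (assumption).
DRANGE_THROUGHPUT_BPS = 8.30e6


def seconds_to_generate(num_bits: int,
                        throughput_bps: float = DRANGE_THROUGHPUT_BPS) -> float:
    """Estimated time to produce num_bits random bits at a sustained rate."""
    if num_bits < 0:
        raise ValueError("num_bits must be non-negative")
    if throughput_bps <= 0:
        raise ValueError("throughput_bps must be positive")
    return num_bits / throughput_bps


for bits in (256, 4096):
    microseconds = seconds_to_generate(bits) * 1e6
    print(f"{bits} random bits: about {microseconds:.2f} microseconds")
```

Under these assumptions, a 256-bit value takes about 31 microseconds to generate.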
Implications
- Accelerating PiM Research: PiDRAM provides a crucial bridge between theoretical PiM concepts and practical, real-world systems. By offering an open, flexible platform that utilizes real DRAM chips, it significantly speeds up the validation and refinement cycle for new PiM techniques.
- Strengthening the RISC-V Ecosystem: The choice of the Rocket Chip RISC-V system makes PiDRAM highly relevant to the RISC-V community. It demonstrates how memory acceleration can be integrated directly into open-source CPU designs, potentially leading to future standard extensions or co-processor interfaces specifically tailored for in-memory computation.
- Democratization of Hardware Innovation: The open-source nature of PiDRAM lowers the barrier to entry for researchers and companies looking to experiment with and implement solutions to the memory bottleneck (Memory Wall) without requiring proprietary simulation tools or dedicated custom silicon.
- Enabling Future Systems: By validating techniques that let fundamental tasks such as data movement (RowClone) and security primitives (D-RaNGe) execute inside the DRAM chips themselves, driven by command sequences issued from the memory controller, PiDRAM paves the way for a new generation of high-performance, energy-efficient computing systems.