Scalable and RISC-V Programmable Near-Memory Computing Architectures for Edge Nodes
Abstract
This paper presents a novel, software-friendly Near-Memory Computing (NMC) approach, featuring two scalable RISC-V programmable architectures, NM-Caesar and NM-Carus, designed to enhance energy efficiency in AI-driven edge nodes. These architectures offer low integration effort and general-purpose usability, overcoming the limitations of previous Compute-In-Memory (CIM) solutions lacking software flexibility. Post-layout simulations demonstrate substantial performance improvements over a state-of-the-art RISC-V CPU, achieving up to 53.9x lower execution time and a peak energy efficiency of 306.7 GOPS/W.
Report
Key Highlights
- Energy Efficiency Breakthrough: Proposes a novel Near-Memory Computing (NMC) approach to reduce the energy cost that data-centric AI/ML workloads incur on traditional von Neumann architectures in edge computing.
- Software-Friendly Design: Focuses on creating a general-purpose NMC solution that requires low implementation effort and offers robust software integration, a major improvement over existing, difficult-to-program CIM solutions.
- Architectural Variants: Introduces two specific architectures, NM-Caesar and NM-Carus, which target different trade-offs in area, performance, and flexibility for diverse embedded microcontroller applications.
- Performance Metrics: Achieves substantial gains over a state-of-the-art RISC-V CPU (RV32IMC), with up to 53.9x lower execution time and up to 35.6x higher system-level energy efficiency.
- Peak Efficiency: The NM-Carus variant achieves a peak energy efficiency of 306.7 GOPS/W (Giga Operations Per Second per Watt) in 8-bit matrix multiplications, a figure presented as competitive among near-memory circuits.
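To put the headline figures in more familiar units, the following minimal sketch converts an efficiency in GOPS/W into energy per operation and applies an "N x lower execution time" factor to a baseline run time. The function names and the 10 ms baseline are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def picojoules_per_op(gops_per_watt: float) -> float:
    """Convert an efficiency in GOPS/W to energy per operation in pJ.

    1 GOPS/W equals 1e9 operations per joule, so the energy per
    operation is 1 / (GOPS/W * 1e9) joules.
    """
    if gops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    joules_per_op = 1.0 / (gops_per_watt * 1e9)
    return joules_per_op * 1e12  # J -> pJ


def accelerated_time(baseline_seconds: float, factor: float) -> float:
    """Run time after an 'N x lower execution time' improvement."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return baseline_seconds / factor


# Peak efficiency quoted in the summary (8-bit matrix multiplication).
print(f"{picojoules_per_op(306.7):.2f} pJ per operation")

# Hypothetical 10 ms CPU baseline: an assumption, not a measured value.
print(f"{accelerated_time(10e-3, 53.9) * 1e3:.3f} ms")
```

Under this conversion, 306.7 GOPS/W corresponds to roughly 3.26 pJ per operation. What counts as one operation depends on the paper's own convention, which this summary does not state.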
Technical Details
- Core Concept: Implementation of the Compute-In-Memory (CIM) paradigm through a Near-Memory Computing (NMC) approach that emphasizes programmability.
- Target Domain: Embedded microcontrollers and next-generation edge computing nodes requiring high energy efficiency for ML/AI tasks.
- Architectures: NM-Caesar and NM-Carus, designed to cover a wide spectrum of embedded needs regarding area and performance scaling.
- Programmability: The architectures are designed to be RISC-V programmable, facilitating easier software development and integration into existing ecosystems.
- Comparison Baseline: Results are obtained from post-layout simulations and compared against executing the same tasks on a state-of-the-art RV32IMC RISC-V CPU core.
- Quantitative Results: System-level improvements reach up to 28.0x and 53.9x lower execution time, and 25.0x and 35.6x higher energy efficiency, for NM-Caesar and NM-Carus respectively.
- Specific Task Efficiency: NM-Carus's peak efficiency of 306.7 GOPS/W is recorded during 8-bit matrix multiplication operations.
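The workload behind the peak figure, 8-bit matrix multiplication, can be sketched as a plain reference kernel together with the arithmetic that turns an operation count into GOPS/W. This is only an illustration under stated assumptions: the summary does not give the matrix sizes, the accumulator width, or how operations are counted, so the 32-bit accumulator and the "one multiply-accumulate equals two operations" convention below are assumptions, and all names are hypothetical.

```python
def matmul_int8(a, b):
    """Multiply an MxK matrix by a KxN matrix of signed 8-bit integers.

    Products are summed in a wider accumulator, as is common for int8
    kernels. Python integers do not overflow, so the assumed 32-bit
    accumulator range is checked explicitly.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    for row in list(a) + list(b):
        if any(not -128 <= v <= 127 for v in row):
            raise ValueError("operand outside the int8 range")
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # one multiply-accumulate
            if not -2**31 <= acc < 2**31:
                raise OverflowError("accumulator exceeds 32 bits")
            out[i][j] = acc
    return out


def matmul_ops(m, k, n):
    """Operation count, assuming each multiply-accumulate is 2 operations."""
    return 2 * m * k * n


def gops_per_watt(ops, seconds, watts):
    """Efficiency in GOPS/W: operations per second, per watt, in units of 1e9."""
    return ops / seconds / watts / 1e9


a = [[1, -2], [3, 4]]
b = [[5, 6], [7, -8]]
print(matmul_int8(a, b))
print(matmul_ops(2, 2, 2))
```

On a near-memory architecture the inner loop would run inside or next to the memory macro instead of on the CPU; the kernel above only fixes what is being computed, not how NM-Caesar or NM-Carus compute it.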
Implications
- Advancing Edge AI: This work provides a crucial hardware foundation for deploying complex, data-intensive AI models directly on resource-constrained edge devices with minimal power overhead.
- RISC-V Ecosystem Acceleration: By coupling high-performance NMC with RISC-V programmability, the architectures offer a scalable and accessible path for hardware acceleration within the RISC-V ISA, strengthening its competitive position against proprietary architectures in the AI space.
- Feasibility of CIM: The emphasis on 'low-integration-effort' and 'software-friendly' design removes major practical roadblocks, significantly lowering the barrier to entry for companies adopting Compute-In-Memory or Near-Memory concepts.
- Architectural Shift: Supports the case for moving computation closer to memory to ease the von Neumann bottleneck, positioning NMC as a strong candidate architecture for future high-performance, low-power embedded systems.