Scalable and RISC-V Programmable Near-Memory Computing Architectures for Edge Nodes

Abstract

This paper presents a novel, software-friendly Near-Memory Computing (NMC) approach, featuring two scalable RISC-V programmable architectures, NM-Caesar and NM-Carus, designed to enhance energy efficiency in AI-driven edge nodes. These architectures offer low integration effort and general-purpose usability, overcoming the limitations of previous Compute-In-Memory (CIM) solutions lacking software flexibility. Post-layout simulations demonstrate substantial performance improvements over a state-of-the-art RISC-V CPU, achieving up to 53.9x lower execution time and a peak energy efficiency of 306.7 GOPS/W.

Report

Key Highlights

  • Energy Efficiency Breakthrough: Proposes a novel Near-Memory Computing (NMC) architecture to address the severe energy constraints imposed by data-centric AI/ML workloads on traditional von Neumann architectures in edge computing.
  • Software-Friendly Design: Focuses on creating a general-purpose NMC solution that requires low implementation effort and offers robust software integration, a major improvement over existing, difficult-to-program CIM solutions.
  • Architectural Variants: Introduces two specific architectures, NM-Caesar and NM-Carus, which target different trade-offs in area, performance, and flexibility for diverse embedded microcontroller applications.
  • Performance Metrics: Achieves substantial gains over a state-of-the-art RISC-V CPU (RV32IMC), showing up to 53.9x faster execution time and 35.6x higher system-level energy efficiency.
  • Peak Efficiency: The NM-Carus variant achieves a peak energy efficiency of 306.7 GOPS/W (giga operations per second per watt) in 8-bit matrix multiplications, setting a new competitive standard for near-memory circuits.

Technical Details

  • Core Concept: Implementation of the Compute-In-Memory (CIM) paradigm through a Near-Memory Computing (NMC) approach that emphasizes programmability.
  • Target Domain: Embedded microcontrollers and next-generation edge computing nodes requiring high energy efficiency for ML/AI tasks.
  • Architectures: NM-Caesar and NM-Carus, designed to cover a wide spectrum of embedded needs regarding area and performance scaling.
  • Programmability: The architectures are designed to be RISC-V programmable, facilitating easier software development and integration into existing ecosystems.
  • Comparison Baseline: Performance results are calculated via post-layout simulations and compared against executing the same tasks on a state-of-the-art RV32IMC RISC-V CPU core.
  • Quantitative Results: System-level improvements of up to 28.0x (NM-Caesar) and 53.9x (NM-Carus) lower execution time, and 25.0x and 35.6x higher energy efficiency, respectively.
  • Specific Task Efficiency: NM-Carus's peak efficiency of 306.7 GOPS/W is recorded during 8-bit matrix multiplication operations.
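The workload behind these comparisons, an 8-bit matrix multiplication as it would run on the RV32IMC baseline CPU, can be sketched as a plain-C reference kernel. This is an illustrative kernel with 32-bit accumulation (the usual choice to avoid int8 overflow), not the paper's benchmark code:

```c
#include <stdint.h>

/* Reference int8 matrix multiply: C = A (m x k) * B (k x n),
   accumulating in 32 bits so products cannot overflow.
   Kernels of this shape are what NM-Caesar and NM-Carus
   offload from the CPU; the loop structure is illustrative. */
void matmul_i8(const int8_t *a, const int8_t *b, int32_t *c,
               int m, int k, int n)
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            int32_t acc = 0;
            for (int p = 0; p < k; p++)
                acc += (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}
```

On a scalar in-order core, this triple loop spends most of its cycles moving operands between memory and registers, which is exactly the von Neumann traffic that near-memory execution removes.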

Implications

  • Advancing Edge AI: This work provides a crucial hardware foundation for deploying complex, data-intensive AI models directly on resource-constrained edge devices with minimal power overhead.
  • RISC-V Ecosystem Acceleration: By coupling high-performance NMC with RISC-V programmability, the architectures offer a scalable and accessible path for hardware acceleration within the RISC-V ISA, strengthening its competitive position against proprietary architectures in the AI space.
  • Feasibility of CIM: The emphasis on 'low-integration-effort' and 'software-friendly' design removes major practical roadblocks, significantly lowering the barrier to entry for companies adopting Compute-In-Memory or Near-Memory concepts.
  • Architectural Shift: Confirms the necessity and feasibility of moving computing closer to memory to break the von Neumann bottleneck, validating NMC as a superior candidate architecture for future high-performance, low-power embedded systems.