ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions
Abstract
ARCANE (Adaptive RISC-V Cache Architecture for Near-memory Extensions) turns the traditional cache subsystem into a tightly-coupled compute-near-memory coprocessor, targeting the data movement bottleneck of von Neumann architectures. The RISC-V cache controller executes custom instructions issued by the host CPU by dispatching vector operations to near-memory vector processing units inside the cache memory subsystem. The implementation shows a $30\times$ to $84\times$ performance improvement over a traditional cached system on a CNN workload when operating on 8-bit data, with a 41.3% area overhead.
Report
Key Highlights
- Core Innovation: ARCANE (Adaptive RISC-V Cache Architecture for Near-memory Extensions) proposes a cache architecture that doubles as a tightly-coupled compute-near-memory (CnM) coprocessor.
- Target: Directly mitigates the von Neumann bottleneck, characterized by extensive data movement, low throughput, and poor energy efficiency in data-driven applications.
- Performance Gain: Achieves a substantial $30\times$ to $84\times$ performance improvement when operating on 8-bit data compared to a traditional cached system, demonstrated using a worst-case 32-bit Convolutional Neural Network (CNN) workload.
- Area Cost: The implementation incurs an area overhead of only 41.3%.
- Usability: Abstracts memory synchronization and data mapping away from application software, improving usability compared to existing in- or near-memory solutions.
Technical Details
- Architecture Type: Adaptive RISC-V Cache Architecture featuring near-memory computing capabilities.
- Processing Units: Vector operations are dispatched to specialized Near-Memory Vector Processing Units (VPUs) embedded within the cache memory subsystem.
- Control Mechanism: The RISC-V cache controller is responsible for executing custom instructions received from the host CPU.
- Extensibility: The architecture supports software-based Instruction Set Architecture (ISA) extensibility, allowing applications to define and utilize custom compute instructions directly within the cache.
- Software Abstraction: The architecture handles data synchronization and mapping internally, so application software does not have to manage data placement, a constraint that limits the usability of many compute-in-memory solutions.
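The control flow described in the list above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the ARCANE implementation: the class names (`ToyVPU`, `ToyCacheController`), the `register_kernel` mechanism standing in for software-based ISA extensibility, the `vadd8` kernel, and the even split of elements across VPUs are all hypothetical and do not come from the source.

```python
from typing import Callable, Dict, List

# Hypothetical toy model: every name below is an illustrative assumption,
# not part of the ARCANE design described in the source.


class ToyVPU:
    """Stand-in for a near-memory vector processing unit.

    It operates directly on the shared cache data array, so no data is
    copied to or from the host CPU.
    """

    def __init__(self, memory: List[int]) -> None:
        self.memory = memory

    def run(self, op: Callable[[int, int], int],
            dst: int, src_a: int, src_b: int, length: int) -> None:
        for i in range(length):
            result = op(self.memory[src_a + i], self.memory[src_b + i])
            self.memory[dst + i] = result & 0xFF  # model 8-bit elements


class ToyCacheController:
    """Cache controller that also executes custom vector instructions."""

    def __init__(self, size: int, num_vpus: int) -> None:
        self.memory: List[int] = [0] * size
        self.vpus = [ToyVPU(self.memory) for _ in range(num_vpus)]
        self.kernels: Dict[str, Callable[[int, int], int]] = {}

    # Ordinary cache accesses from the host CPU.
    def write(self, addr: int, values: List[int]) -> None:
        self.memory[addr:addr + len(values)] = [v & 0xFF for v in values]

    def read(self, addr: int, length: int) -> List[int]:
        return self.memory[addr:addr + length]

    # Assumed stand-in for software-defined custom instructions.
    def register_kernel(self, name: str, op: Callable[[int, int], int]) -> None:
        self.kernels[name] = op

    def execute_custom(self, name: str, dst: int,
                       src_a: int, src_b: int, length: int) -> None:
        """Split one custom instruction into per-VPU vector operations."""
        op = self.kernels[name]
        chunk = -(-length // len(self.vpus))  # ceiling division
        for idx, vpu in enumerate(self.vpus):
            start = idx * chunk
            count = min(chunk, length - start)
            if count <= 0:
                break
            vpu.run(op, dst + start, src_a + start, src_b + start, count)


if __name__ == "__main__":
    cache = ToyCacheController(size=64, num_vpus=4)
    cache.register_kernel("vadd8", lambda a, b: a + b)
    cache.write(0, [1, 2, 3, 4])
    cache.write(16, [10, 20, 30, 40])
    cache.execute_custom("vadd8", dst=32, src_a=0, src_b=16, length=4)
    print(cache.read(32, 4))
```

In this model the host only issues `execute_custom` with addresses it already uses for normal loads and stores; where the operands live and how the work is divided among VPUs is decided by the controller, which is the kind of abstraction the Software Abstraction point refers to.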
Implications
- Future of RISC-V: ARCANE provides a high-performance blueprint for integrating computation directly into RISC-V memory hierarchies, enhancing the capability of RISC-V cores in accelerating AI and data-intensive workloads.
- Data-Centric Computing: This solution offers a practical path toward solving the memory wall problem for modern data-driven applications, making high throughput and energy efficiency attainable without discarding the fundamental flexibility of the CPU.
- Usability in CnM: By abstracting complex data placement and synchronization requirements, ARCANE addresses a major adoption barrier for near-memory computing, potentially accelerating the widespread use of CnM architectures in commercial products.
- Efficiency: Achieving massive performance gains ($30\times$ to $84\times$) with a relatively low area overhead (41.3%) makes this architecture economically feasible for integration into system-on-chip (SoC) designs.
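As a rough way to read the Efficiency point, the reported speedup can be normalized by the added area. The calculation below is a back-of-the-envelope sketch, not a figure from the paper: it assumes the 41.3% overhead is measured against the same baseline system as the speedup, and the function name `speedup_per_area` is introduced here for illustration only.

```python
def speedup_per_area(speedup: float, area_overhead: float) -> float:
    """Return speedup divided by relative area, where baseline area is 1.0.

    area_overhead is a fraction, e.g. 0.413 for a 41.3% overhead.
    """
    if area_overhead <= -1.0:
        raise ValueError("area_overhead must be greater than -1.0")
    return speedup / (1.0 + area_overhead)


AREA_OVERHEAD = 0.413  # 41.3% area overhead, as reported in the summary

if __name__ == "__main__":
    for speedup in (30.0, 84.0):  # reported range of performance improvement
        ratio = speedup_per_area(speedup, AREA_OVERHEAD)
        print(f"{speedup:.0f}x speedup -> {ratio:.1f}x per unit area")
```

Under that assumption, the reported range corresponds to roughly $21\times$ to $59\times$ performance per unit area relative to the baseline.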