CVA6S+: A Superscalar RISC-V Core with High-Throughput Memory Architecture

Abstract

CVA6S+ is an enhanced, open-source superscalar RISC-V core with microarchitectural optimizations such as improved branch prediction and register renaming. These enhancements yield a 43.5% performance increase over the scalar CVA6 core at an area overhead of only 9.30%. Integrating the OpenHW Core-V HPDCache additionally yields a 74.1% memory bandwidth improvement, positioning CVA6S+ for high-throughput embedded domains such as automotive.

Report

CVA6S+: A Superscalar RISC-V Core Analysis

Key Highlights

  • Significant Performance Improvement: CVA6S+ achieves a 43.5% performance uplift (IPC) compared to the original scalar CVA6 configuration.
  • Efficient Iteration: The core delivers a 10.9% performance improvement over its immediate superscalar predecessor, CVA6S.
  • Minimal Overhead: The substantial performance gains are achieved with a low area overhead of just 9.30% relative to the scalar CVA6 core.
  • High-Throughput Memory: Integration with the OpenHW Core-V HPDCache results in a 74.1% memory bandwidth improvement over the legacy CVA6 cache subsystem.
  • Target Domain: The core is explicitly designed to maximize Instructions Per Cycle (IPC) for high-end embedded applications, such as automotive.

Technical Details

  • Core Foundation: Built upon the established, open-source CVA6 and CVA6S RISC-V cores.
  • Superscalar Enhancements: The primary optimizations focus on microarchitectural features including:
    • Improved branch prediction.
    • Enhanced register renaming techniques.
    • Optimized operand forwarding paths.
  • Memory Architecture: The core uses the OpenHW Core-V High-Performance L1 Dcache (HPDCache) for data access, relieving the memory-bandwidth bottleneck of the legacy CVA6 cache subsystem.
  • Metrics Cited: 43.5% performance gain (vs. CVA6 scalar), 10.9% performance gain (vs. CVA6S), 9.30% area overhead (vs. CVA6 scalar), and 74.1% bandwidth improvement (vs. legacy CVA6 cache).
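
The cited percentages can be combined to estimate two figures the summary does not state directly: the gain of CVA6S over scalar CVA6, and the performance-per-area ratio of CVA6S+ relative to scalar CVA6. The sketch below is illustrative arithmetic only. It assumes that the 43.5% and 10.9% gains were measured on the same workload and metric, so that speedups compose multiplicatively, and that area scales as a simple ratio. The function names are hypothetical and do not come from the source.

```python
def speedup(gain_percent: float) -> float:
    """Convert a percentage gain (e.g. 43.5) into a speedup factor (1.435)."""
    return 1.0 + gain_percent / 100.0


def implied_gain_percent(total_gain_percent: float,
                         incremental_gain_percent: float) -> float:
    """Gain of the intermediate design over the baseline, in percent.

    Assumption: speedups compose multiplicatively, i.e.
    speedup(total) = speedup(intermediate) * speedup(incremental).
    """
    factor = speedup(total_gain_percent) / speedup(incremental_gain_percent)
    return (factor - 1.0) * 100.0


def perf_per_area_ratio(perf_gain_percent: float,
                        area_overhead_percent: float) -> float:
    """Performance-per-area of the new design relative to the baseline."""
    return speedup(perf_gain_percent) / speedup(area_overhead_percent)


if __name__ == "__main__":
    # Figures cited in the summary.
    cva6s_plus_vs_cva6 = 43.5   # % performance gain, CVA6S+ vs. scalar CVA6
    cva6s_plus_vs_cva6s = 10.9  # % performance gain, CVA6S+ vs. CVA6S
    area_overhead = 9.30        # % area overhead, CVA6S+ vs. scalar CVA6

    cva6s_gain = implied_gain_percent(cva6s_plus_vs_cva6, cva6s_plus_vs_cva6s)
    ratio = perf_per_area_ratio(cva6s_plus_vs_cva6, area_overhead)

    print(f"Implied CVA6S gain over CVA6: {cva6s_gain:.1f}%")
    print(f"CVA6S+ performance per area vs. CVA6: {ratio:.2f}x")
```

Under these assumptions, the implied gain of CVA6S over scalar CVA6 is about 29.4%, and CVA6S+ delivers roughly 1.31 times the performance per unit area of the scalar core. These are derived estimates, not results reported in the source.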

Implications

  • RISC-V Competitiveness: CVA6S+ suggests that the open-source RISC-V ecosystem can produce competitive, high-performance superscalar cores that pair a large performance gain (43.5%) with a small area cost (9.30%), challenging proprietary architectures.
  • Embedded Market Penetration: By addressing the critical need for high IPC and maximizing memory bandwidth, CVA6S+ strengthens RISC-V's position in demanding high-end embedded markets, notably automotive, where reliability and throughput are paramount.
  • Open Source Development Synergy: The successful integration of an external open-source component (OpenHW Core-V HPDCache) highlights effective collaboration, demonstrating how shared infrastructure can boost core performance, specifically memory throughput.
  • Optimized Design Path: The incremental 10.9% gain over CVA6S shows that focused microarchitectural refinement, rather than complete redesign, can yield significant performance increases, making the upgrade path viable for existing CVA6 users.