Ramping Up Open-Source RISC-V Cores: Assessing the Energy Efficiency of Superscalar, Out-of-Order Execution
Abstract
This work modifies the XuanTie C910, an Out-of-Order (OoO) RISC-V core that relies on proprietary interfaces, to achieve full RISC-V standard compliance, and introduces CVA6S+, an enhanced dual-issue version of the CVA6 core that delivers a 34.4% performance improvement over CVA6. A detailed analysis of the C910, CVA6S+, and CVA6 cores, all implemented in 22nm technology, compares their performance, area, and efficiency. Contrary to conventional expectations, the superscalar OoO C910 proved highly competitive in energy efficiency (GOPS/W), challenging the assumption that high performance must come at a significant efficiency cost.
Report
Key Highlights
- Standardization of C910: The superscalar Out-of-Order (OoO) XuanTie C910 core was modified to achieve full RISC-V standard compliance across its debug, interrupt, and memory interfaces, resolving issues related to proprietary protocols (like non-standard AXI extensions).
- Introduction of CVA6S+: CVA6S+, an enhanced dual-issue superscalar in-order version of the open-source CVA6 core, delivers a 34.4% improvement in IPC over the vanilla CVA6.
- Energy Efficiency Finding: The study reveals that the high-performance C910 core is highly competitive in energy efficiency (GOPS/W), despite its complexity, challenging the assumption that OoO execution inherently sacrifices efficiency.
- Area Efficiency: The CVA6S+ core was found to lead in area efficiency (GOPS/mm²).
- Performance Scaling: Compared to the scalar CVA6, the C910 achieved a 119.5% improvement in IPC while requiring a 75% increase in area.
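The relative figures above can be turned into a rough IPC-per-area comparison. The sketch below uses only the percentages quoted in this summary and assumes all cores run at the same clock frequency, which the summary does not state; the function names are illustrative. Because the summary does not give the CVA6S+ area overhead, the code derives the break-even overhead instead of assuming a value.

```python
# Relative figures quoted in the summary, all measured against the scalar CVA6.
C910_IPC_GAIN = 1.195         # +119.5% IPC
C910_AREA_GAIN = 0.75         # +75% area
CVA6S_PLUS_IPC_GAIN = 0.344   # +34.4% IPC


def ipc_per_area_ratio(ipc_gain: float, area_gain: float) -> float:
    """IPC per unit area relative to the baseline core (baseline = 1.0)."""
    return (1.0 + ipc_gain) / (1.0 + area_gain)


def break_even_area_gain(ipc_gain: float, target_ratio: float) -> float:
    """Largest area increase at which a core still reaches target_ratio."""
    return (1.0 + ipc_gain) / target_ratio - 1.0


# Assumption: equal clock frequency, so IPC per area stands in for GOPS/mm².
c910_ratio = ipc_per_area_ratio(C910_IPC_GAIN, C910_AREA_GAIN)
cva6s_plus_limit = break_even_area_gain(CVA6S_PLUS_IPC_GAIN, c910_ratio)

print(f"C910 IPC per area vs. CVA6: {c910_ratio:.3f}x")
print(f"CVA6S+ matches that with up to +{cva6s_plus_limit:.1%} area")
```

Under these assumptions the C910 delivers about 1.25x the IPC per area of CVA6, and CVA6S+ stays ahead of it on this metric as long as its area overhead remains below roughly 7%. Differences in achievable clock frequency would shift both numbers.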
Technical Details
- Cores Compared: The analysis covered three distinct microarchitectures that share the same ISA and implementation flow:
  - C910 (Modified): Superscalar, Out-of-Order (OoO).
  - CVA6S+: Superscalar, In-Order, Dual-Issue.
  - CVA6 (Vanilla): Single-Issue, In-Order.
- Development and Implementation: All cores were implemented using identical technology, tools, and methodologies.
  - Technology Node: 22nm technology.
  - Integration Platform: The open-source modular System-on-Chip (SoC) known as Cheshire.
- C910 Standardization: The modifications replaced the core's proprietary AXI protocol extensions, interrupt handling, and debug protocols with standard-compliant interfaces, aiming at compatibility with RISC-V standards and industrial electronic design automation (EDA) tools.
- Metrics: Performance (IPC), Area (mm²), Power (W), and resulting efficiencies (GOPS/W and GOPS/mm²) were measured.
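The efficiency metrics listed above follow from the measured quantities. The sketch below shows one plausible way to derive them, assuming one operation per retired instruction so that GOPS equals IPC times clock frequency in GHz; the summary does not spell out this definition. The class name, field names, and sample values are hypothetical placeholders, not measurements from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreMeasurement:
    """Measured quantities for one core (field names are illustrative)."""
    name: str
    ipc: float        # instructions per cycle
    freq_ghz: float   # clock frequency in GHz
    power_w: float    # power in watts
    area_mm2: float   # silicon area in mm²

    @property
    def gops(self) -> float:
        # Assumption: one operation per retired instruction.
        return self.ipc * self.freq_ghz

    @property
    def gops_per_watt(self) -> float:
        """Energy efficiency."""
        return self.gops / self.power_w

    @property
    def gops_per_mm2(self) -> float:
        """Area efficiency."""
        return self.gops / self.area_mm2


# Placeholder values for illustration only; NOT results from the study.
example = CoreMeasurement(
    name="example-core", ipc=1.5, freq_ghz=2.0, power_w=0.25, area_mm2=0.5
)
print(example.gops, example.gops_per_watt, example.gops_per_mm2)
```

Written this way, the energy-efficiency result is easier to interpret: a more complex core remains competitive in GOPS/W whenever its throughput grows about as fast as its power draw.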
Implications
- Accelerating Industrial Adoption: By bringing the high-performance C910 core into standard compliance, the research addresses major hurdles (proprietary interfaces, limited EDA support) that currently hold back adoption of high-IPC open-source RISC-V cores in demanding domains such as automotive and space.
- Revisiting Efficiency Trade-offs: The finding that the OoO C910 is highly competitive in energy efficiency challenges the conventional wisdom that high performance necessarily entails a significant energy penalty. It supports the feasibility of designing aggressive yet efficient open-source OoO RISC-V cores.
- Providing Benchmarks for Architects: Implementing diverse microarchitectures (single-issue in-order, dual-issue superscalar in-order, and superscalar OoO) in the same technology (22nm) and on the same SoC platform provides directly comparable data for designers trading off area, power, and performance.
- Enhanced Open-Source Ecosystem: The standard-compliant C910 and the optimized CVA6S+ add two high-performance cores to the open-source RISC-V hardware ecosystem.