CVA6 RISC-V Virtualization: Architecture, Microarchitecture, and Design Space Exploration
Abstract
This article describes the implementation and optimization of hardware virtualization support for the open-source RISC-V CVA6 core, covering both architectural and microarchitectural enhancements. The authors introduce dedicated structures such as the G-Stage TLB (GTLB) and an L2 TLB to reduce the performance overhead of virtualization. The optimal configuration, identified through Design Space Exploration, achieves an average functional performance speedup of 12.5% over a virtualization-aware but non-optimized design, at a cost of a 0.78% area increase and a 0.33% power increase.
Report
Structured Report: CVA6 RISC-V Virtualization
Key Highlights
- RISC-V Virtualization Implementation: The work successfully integrates and optimizes hardware virtualization support into the open-source RISC-V CVA6 core.
- Performance Optimization: The implemented microarchitectural enhancements yielded a substantial performance speedup of up to 16% (approximately 12.5% on average) compared to a virtualization-aware but non-optimized design.
- Efficiency: This significant performance gain was achieved at a minimal resource cost, requiring only a 0.78% increase in area and a 0.33% increase in power.
- Design Space Exploration (DSE): A comprehensive DSE was performed using both post-layout simulations (22nm FDX technology) and functional assessment on an FPGA platform (Genesys 2) to select the optimal design point.
- Open Source: All architecture and microarchitecture work described is publicly available, allowing the community to utilize and further iterate on the designs.
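As a rough sanity check on the efficiency figures above, the following sketch relates the reported speedup to the reported area and power costs. It assumes the 12.5% average speedup can be read as a 1.125x throughput ratio and that the overheads apply to the whole core; the function name is illustrative and not taken from the paper.

```python
def gain_per_cost(speedup_pct: float, overhead_pct: float) -> float:
    """Ratio of relative performance to relative cost (1.0 means break-even)."""
    return (1 + speedup_pct / 100) / (1 + overhead_pct / 100)


# Figures reported in the summary: 12.5% average speedup,
# 0.78% area increase, 0.33% power increase.
perf_per_area = gain_per_cost(12.5, 0.78)
perf_per_power = gain_per_cost(12.5, 0.33)

print(f"performance per area:  {perf_per_area:.3f}x")
print(f"performance per power: {perf_per_power:.3f}x")
```

Under these assumptions both ratios come out close to 1.12, so nearly all of the speedup remains after accounting for the added area and power.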
Technical Details
| Aspect | Specification/Method |
|---|---|
| Target Core | RISC-V CVA6 Core |
| Virtualization Mechanism | RISC-V hardware virtualization support |
| Key Enhancements | G-Stage Translation Lookaside Buffer (GTLB) and L2 TLB |
| Optimization Goal | Alleviating performance overhead associated with two-stage address translation in virtualization. |
| Technology Node | Post-layout simulations based on 22nm FDX technology |
| Functional Testing | FPGA mapping using the Genesys 2 platform |
| Software Stack | MiBench benchmark suite running on Linux atop the Bao hypervisor (single-core configuration) |
| Metrics Assessed | Performance, Power, and Area (PPA) |
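To see why two-stage translation is costly, the sketch below counts worst-case memory accesses for a nested page walk. It assumes three-level Sv39 page tables for the guest (VS-stage) and three-level Sv39x4 tables for the G-stage, and it models the GTLB simply as a cache of guest-physical to host-physical translations with a given hit rate. These parameters and function names are illustrative assumptions, not details taken from the report.

```python
def nested_walk_accesses(guest_levels: int, gstage_levels: int) -> int:
    """Worst-case memory accesses for a two-stage walk with no caching at all.

    Each guest page-table pointer is a guest-physical address, so reading
    each of the guest_levels PTEs first needs a full G-stage walk. The final
    guest-physical address then needs one more G-stage walk.
    """
    per_guest_pte = gstage_levels + 1  # G-stage walk plus the PTE read itself
    return guest_levels * per_guest_pte + gstage_levels


def expected_accesses(guest_levels: int, gstage_levels: int,
                      gtlb_hit_rate: float) -> float:
    """Expected accesses when a GTLB (assumed model) short-circuits G-stage walks."""
    gstage_walks = guest_levels + 1               # one per guest PTE, one for the final address
    missed_walks = gstage_walks * (1.0 - gtlb_hit_rate)
    return guest_levels + missed_walks * gstage_levels


if __name__ == "__main__":
    # Assumed configuration: Sv39 guest tables over Sv39x4 G-stage tables.
    print("no caching:", nested_walk_accesses(3, 3))
    for rate in (0.0, 0.5, 1.0):
        print(f"GTLB hit rate {rate:.1f}:", expected_accesses(3, 3, rate))
```

In this simplified model a full nested walk needs 15 memory accesses, while a GTLB that always hits leaves only the 3 guest page-table reads, which is the kind of overhead the GTLB and L2 TLB are meant to reduce.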
Implications
This research gives RISC-V systems that require hardware virtualization an open, quantitatively evaluated starting point. Its main implications are:
- Validation of RISC-V Ecosystem: It validates the practical implementation and optimization of the RISC-V virtualization extension, demonstrating that efficient hardware-assisted virtualization is feasible on open cores.
- Reference Implementation: The CVA6 implementation serves as a crucial open-source reference design for other commercial and academic RISC-V projects looking to incorporate virtualization efficiently.
- Resource Efficiency for Embedded Systems: By delivering an average 12.5% speedup for a 0.78% area increase and a 0.33% power increase, the CVA6 virtualization implementation is well suited to embedded systems, edge computing, and automotive applications, where resource constraints are severe.
- Performance Benchmarking: The quantified results, particularly the effectiveness of structures like the GTLB and L2 TLB in addressing translation latency, inform future microarchitectural development for RISC-V processors.