HEROv2: Full-Stack Open-Source Research Platform for Heterogeneous Computing
Abstract
HEROv2 is an FPGA-based, full-stack, open-source research platform designed to enable fast and accurate exploration of heterogeneous computing architectures, circumventing the compromises of traditional simulators. It integrates application-class 64-bit hosts (ARMv8 or RV64) with clusters of 32-bit RISC-V accelerators, supporting mixed instruction set architectures (ISAs) and data models. The platform features an LLVM-based compiler that automates complex tasks such as loop tiling and data transfer inference, achieving speedups of up to 4.4x over the original programs.
Report
Key Highlights
- Research Platform: HEROv2 is an FPGA-based, full-stack, open-source platform designed for the accurate and fast exploration of heterogeneous computers, serving as an alternative to slower simulators.
- Mixed Architecture: The platform combines application-class 64-bit host processors (ARMv8 or RV64) with domain-specific accelerators built on clusters of 32-bit RISC-V cores.
- Open-Source Stack: It provides a fully open-source environment including the on-chip network, a unified heterogeneous programming interface, and a compiler toolchain.
- Performance Results: Compiler optimizations (loop tiling and data transfer inference) yield speedups of up to 4.4x over the original programs. The automatically generated code is only about 15% slower than handwritten implementations, which are highly complex and required 2.6x more code.
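To make the mixed data model concrete: a 64-bit host can hold addresses that a 32-bit accelerator pointer cannot represent. The following is a minimal sketch of that mismatch only. The type names `host_addr_t` and `accel_addr_t` and the function `to_accel_addr` are hypothetical and are not part of HEROv2, and the summary does not describe how the platform itself resolves the mismatch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration: addresses as seen by a 64-bit host
 * and by a 32-bit accelerator core. Not HEROv2 definitions. */
typedef uint64_t host_addr_t;
typedef uint32_t accel_addr_t;

/* Try to narrow a host address to the accelerator's native pointer width.
 * Returns false when the address does not fit in 32 bits, in which case
 * *out is left untouched and the caller needs some other mechanism
 * (for example, copying the data into accelerator-addressable memory). */
bool to_accel_addr(host_addr_t addr, accel_addr_t *out) {
    if (addr > UINT32_MAX) {
        return false;
    }
    *out = (accel_addr_t)addr;
    return true;
}
```

An address such as `0x80000000` narrows without loss, while `0x100000000` is rejected because it needs 33 bits.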
Technical Details
- Platform Technology: Uses an FPGA implementation to model performance both quickly and accurately, avoiding the speed and accuracy compromises of simulators.
- ISA Heterogeneity: Specifically designed to handle the complexity of mixed-ISA systems, bridging 64-bit hosts and 32-bit accelerators, including RV64 (host) and RV32 (accelerator) configurations.
- Interconnect and Data Management: Features a fully open-source on-chip network and mechanisms to seamlessly share data between the 64-bit host and 32-bit accelerator domains.
- Compiler Toolchain: Uses a mixed-data-model, mixed-ISA heterogeneous compiler based on the LLVM framework.
- Compiler Capabilities: The compiler can automatically tile loops and infer necessary data transfers between the host and accelerator, simplifying programmer effort significantly.
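The loop tiling and data transfers mentioned in the list above can be illustrated by hand. The sketch below is not output of the HEROv2 compiler. It shows, under stated assumptions, the kind of transformation such a compiler automates: the loop is split into tiles that fit a small local buffer, and each tile is copied in, processed, and copied back. `TILE_ELEMS`, `copy_block`, `scale_naive`, and `scale_tiled` are hypothetical names, and `memcpy` stands in for whatever DMA mechanism a real accelerator would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local-buffer capacity in elements; not a HEROv2 parameter. */
#define TILE_ELEMS 256

/* Stand-in for a DMA transfer between host memory and accelerator-local
 * memory. A real platform would call its own DMA interface here. */
static void copy_block(float *dst, const float *src, size_t n) {
    memcpy(dst, src, n * sizeof(float));
}

/* Original loop: every access goes directly to the full array. */
void scale_naive(float *data, size_t n, float k) {
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        data[i] *= k;
    }
}

/* Tiled loop: work on at most TILE_ELEMS elements at a time out of a
 * local buffer, with explicit transfers in and out for each tile. */
void scale_tiled(float *data, size_t n, float k) {
    assert(data != NULL || n == 0);
    float local[TILE_ELEMS];
    for (size_t base = 0; base < n; base += TILE_ELEMS) {
        /* The last tile may be shorter than TILE_ELEMS. */
        size_t len = (n - base < TILE_ELEMS) ? (n - base) : TILE_ELEMS;
        copy_block(local, data + base, len);      /* transfer in  */
        for (size_t i = 0; i < len; i++) {
            local[i] *= k;                        /* compute on local data */
        }
        copy_block(data + base, local, len);      /* transfer out */
    }
}
```

Both functions produce the same result. The tiled version is longer and easier to get wrong, which is the programmer effort that automatic tiling and transfer inference aim to remove.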
Implications
- Advancing RISC-V Heterogeneity: HEROv2 provides a vital, practical testbed for developing and validating highly complex heterogeneous systems that leverage RISC-V, particularly in demonstrating effective integration between 32-bit efficiency cores and 64-bit application processors.
- Accelerated Research: By offering an FPGA platform, it shortens the research and development cycle compared to slow architectural simulators, enabling faster validation of new hardware architectures and system software approaches.
- Full-Stack Co-Design: The platform is crucial for full-stack research, allowing simultaneous exploration of hardware microarchitecture, system architecture, programming models, and toolchain development.
- Reduced Programming Complexity: The LLVM-based compiler automatically generates code that comes within about 15% of handwritten implementations. This result supports compiler-driven methods for reducing the complexity of writing software for heterogeneous systems, making specialized hardware more accessible to general programmers.