Arrow: A RISC-V Vector Accelerator for Machine Learning Inference
Abstract
This paper introduces Arrow, a configurable hardware accelerator architecture that implements a subset of the RISC-V v0.9 vector ISA extension and targets edge machine learning inference. Benchmarked on fundamental vector and matrix operations, Arrow runs substantially faster than a scalar RISC processor. When implemented on a Xilinx FPGA, the Arrow co-processor achieves speedups between 2x and 78x while consuming 20% to 99% less energy.
Report
Key Highlights
- Novel Architecture: Introduces Arrow, a configurable hardware accelerator architecture that works as a vector co-processor.
- RISC-V Compliance: Arrow implements a subset of the RISC-V v0.9 vector ISA extension.
- Target Application: Focused on accelerating edge machine learning (ML) inference tasks.
- Performance Gain: Executes the benchmarked vector and matrix operations 2x to 78x faster than a scalar RISC processor.
- Energy Efficiency: Consumes 20% to 99% less energy than the scalar RISC processor.
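The two headline figures can be read with the usual definitions of speedup and relative energy saving. The sketch below assumes those conventional definitions; the summary does not say how the paper measured time or energy, and the function names and sample inputs are hypothetical, chosen only to show what the upper ends of the reported ranges mean.

```python
def speedup(t_scalar_s: float, t_vector_s: float) -> float:
    """Speedup of the vector run over the scalar run (assumed definition)."""
    if t_scalar_s <= 0 or t_vector_s <= 0:
        raise ValueError("execution times must be positive")
    return t_scalar_s / t_vector_s


def energy_saving_pct(e_scalar_j: float, e_vector_j: float) -> float:
    """Percentage of energy saved relative to the scalar run (assumed definition)."""
    if e_scalar_j <= 0 or e_vector_j < 0:
        raise ValueError("energies must be non-negative and the baseline positive")
    return 100.0 * (e_scalar_j - e_vector_j) / e_scalar_j


# Illustrative inputs only, not measurements from the paper:
# a kernel that takes 78 time units on the scalar core and 1 on the
# accelerator has a 78x speedup; using 1 energy unit instead of 100
# corresponds to a 99% saving.
example_speedup = speedup(78.0, 1.0)
example_saving = energy_saving_pct(100.0, 1.0)
```

Under these definitions, a 99% saving means the accelerated run uses one hundredth of the baseline energy, while a 20% saving means it uses four fifths.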
Technical Details
- Accelerator Name: Arrow.
- ISA Basis: RISC-V v0.9 vector ISA extension (subset).
- Core Workloads: Execution of vector and matrix benchmarks that are fundamental to ML inference.
- Implementation Platform: Validation and testing were performed using a Xilinx XC7A200T-1SBG484C FPGA.
- Design Goal: The architecture is designed to be configurable, allowing optimization for specific edge ML requirements.
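To give a feel for the kind of vector workload listed above, here is a minimal Python model of a strip-mined vector addition in the style of the RISC-V vector extension. It is a sketch under stated assumptions, not Arrow's implementation: `vsetvl` is a simplified model that grants `min(AVL, VLMAX)` elements, and the parameters `vlen_bits`, `sew_bits`, and `lmul` are hypothetical values, since the summary does not give Arrow's configuration.

```python
def vsetvl(avl: int, vlmax: int) -> int:
    """Simplified model of RVV vsetvl: grant at most vlmax elements."""
    return min(avl, vlmax)


def vector_add(a, b, vlen_bits=256, sew_bits=32, lmul=1):
    """Strip-mined element-wise add; returns (result, number_of_strips).

    vlen_bits, sew_bits and lmul are illustrative assumptions,
    not Arrow's actual parameters.
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    vlmax = (vlen_bits // sew_bits) * lmul  # elements per vector register group
    out = []
    strips = 0
    i = 0
    n = len(a)
    while i < n:
        vl = vsetvl(n - i, vlmax)  # elements handled in this strip
        # One "vector instruction" processes vl elements at once.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        strips += 1
    return out, strips


# With the assumed 256-bit registers and 32-bit elements, VLMAX is 8,
# so 10 elements take two strips (8 elements, then 2).
result, strips = vector_add(list(range(10)), [1] * 10)
```

A scalar processor would issue one add per element, whereas the vector loop issues one vector add per strip; this difference in instruction count is the basic source of speedup that a vector co-processor exploits.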
Implications
- Validation of RISC-V Vector ISA: Arrow's implementation and measured results support the RISC-V vector extension (v0.9) as a basis for hardware acceleration, particularly for computationally intensive tasks such as ML inference.
- Edge AI Enablement: The reported speedups and energy savings make Arrow a candidate for running ML inference directly on resource-constrained edge devices.
- Open Hardware Ecosystem Growth: By demonstrating a high-performance vector accelerator built upon the open RISC-V standard, this work encourages further development and adoption of specialized hardware within the RISC-V ecosystem, driving innovation in custom computing solutions.