SPEED: A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference

Abstract

SPEED is a scalable RISC-V Vector (RVV) processor designed for highly efficient multi-precision deep neural network (MP-DNN) inference on resource-constrained edge platforms. It introduces customized RVV instructions and a parameterized multi-precision tensor unit that supports precisions from 4-bit to 16-bit with minimal hardware overhead. Experimental results show that SPEED achieves a peak throughput of 737.9 GOPS for 4-bit operations and superior area efficiency compared with prior RVV processors.

Report

Key Highlights

  • Target: SPEED is a scalable RISC-V Vector (RVV) processor optimized for efficient multi-precision DNN (MP-DNN) inference on edge platforms.
  • Performance Metrics (4-bit): Achieves a peak throughput of 737.9 GOPS and an energy efficiency of 1383.4 GOPS/W for 4-bit operations.
  • Area Efficiency: Outperforms prior RVV processors by 5.9x to 26.9x for 8-bit operations and by 8.2x to 18.5x in best integer performance.
  • Precision Range: Supports computation precisions from 4-bit to 16-bit with minimal hardware overhead.
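As a quick sanity check on the reported figures, the implied power draw at peak 4-bit throughput follows from dividing peak throughput by energy efficiency (the two numbers come from the summary above; the arithmetic is our own):

```python
# Reported 4-bit figures for SPEED.
peak_gops = 737.9               # peak throughput in GOPS
efficiency_gops_per_w = 1383.4  # energy efficiency in GOPS/W

# Power (W) = throughput (GOPS) / efficiency (GOPS/W).
implied_power_w = peak_gops / efficiency_gops_per_w
print(f"Implied power at peak 4-bit throughput: {implied_power_w:.2f} W")
```

Roughly half a watt at peak, which is consistent with an edge-class power budget.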

Technical Details

  • Instruction Set Innovation: Customized instructions built on the RVV extension reduce instruction complexity and support multi-precision processing from 4-bit to 16-bit.
  • Parallelism Enhancement: A parameterized multi-precision tensor unit is developed and integrated into the scalable module. This unit provides reconfigurable parallelism to optimally match the varying computation patterns required by diverse MP-DNNs.
  • Dataflow Optimization: A flexible mixed dataflow method is adopted to dynamically improve both computational and energy efficiency based on the specific computing patterns of different DNN operators.
  • Implementation: The processor was synthesized in a TSMC 28 nm process.
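How a fixed-width datapath yields more parallel lanes at lower precision can be sketched in a few lines (a simplified illustration of the general idea; the 64-bit datapath width and the `lanes` helper are our assumptions, not SPEED's actual microarchitecture):

```python
def lanes(datapath_bits: int, elem_bits: int) -> int:
    """Number of elements a fixed-width datapath can process in parallel."""
    return datapath_bits // elem_bits

# A 64-bit slice of a vector datapath holds:
for prec in (16, 8, 4):
    print(f"{prec}-bit elements -> {lanes(64, prec)} lanes")
```

Halving the precision doubles the usable parallelism, which is the basic reason the 4-bit configuration reaches the highest peak throughput.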
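The mixed-dataflow idea of choosing a loop order, and therefore a data-reuse pattern, per operator can be illustrated with two schedules of the same matrix multiply (a conceptual sketch only; SPEED's actual dataflows are not specified in this summary):

```python
def matmul_output_stationary(A, B):
    """Accumulate each output element to completion before moving on:
    maximizes partial-sum reuse (suits compute-heavy operators)."""
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_weight_stationary(A, B):
    """Load each weight once and fully reuse it across outputs:
    minimizes weight re-fetches (suits bandwidth-bound operators)."""
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0] * N for _ in range(M)]
    for k in range(K):
        for j in range(N):
            w = B[k][j]  # weight stays resident while the i-loop runs
            for i in range(M):
                C[i][j] += A[i][k] * w
    return C

# Both schedules compute the same result; only the memory-access
# pattern (and hence energy per access) differs.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul_output_stationary(A, B) == matmul_weight_stationary(A, B)
```

Selecting the schedule per operator, rather than fixing one for the whole network, is what lets a mixed dataflow improve both computational and energy efficiency.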

Implications

  • RISC-V Acceleration: SPEED significantly enhances the RISC-V ecosystem's capability in the specialized field of AI acceleration, validating the extensibility of the instruction set architecture (ISA) for domain-specific tasks.
  • Edge AI Deployment: By efficiently tackling the complexity of multi-precision quantization, SPEED makes deploying advanced, highly-quantized DNNs viable even on severely resource-constrained edge devices.
  • Competitive Advantage: The demonstrated area and energy efficiency positions RISC-V-based processors such as SPEED as competitive alternatives to proprietary architectures in the growing market for energy-efficient AI inference.
