FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing
Abstract
FPnew is a highly configurable, open-source transprecision floating-point unit (TP-FPU) architecture designed to enable energy-proportional computing by handling a wide range of standard and custom FP formats. The unit extends the RISC-V ISA to incorporate operations on half-precision, bfloat16, 8-bit formats, and SIMD vectors. Integrated into a 32-bit RISC-V core, FPnew speeds up mixed-precision applications by up to 1.67x over an FP32 baseline, and its silicon implementation reaches measured energy efficiencies of up to 2.95 Tflop/sW on 8-bit mini-floats.
Report
FPnew: Transprecision FPU Architecture
Key Highlights
- Open-Source Transprecision FPU (TP-FPU): FPnew is presented as a highly configurable, open-source floating-point unit designed specifically for energy-proportional transprecision computing.
- Performance Gain: Integration into a 32-bit RISC-V core resulted in a 1.67x speedup for mixed-precision applications, alongside a 37% reduction in system energy compared to an FP32 baseline.
- Leading-Edge Energy Efficiency: Fabricated silicon in Globalfoundries 22FDX technology demonstrated peak measured efficiencies between 178 Gflop/sW (on FP64) and 2.95 Tflop/sW (on 8-bit mini-floats).
- High Throughput: The 64-bit core integration achieved performance ranging from 3.2 Gflop/s to 25.3 Gflop/s.
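The efficiency and throughput figures above can be restated as energy per operation and as ratios. The following is a minimal sketch of that arithmetic, not something taken from the paper: the constant and function names are hypothetical, and the summary does not say which format each throughput endpoint belongs to.

```python
# Peak figures quoted in the highlights above (measured silicon, 22FDX).
EFF_FP64_FLOP_PER_JOULE = 178e9    # 178 Gflop/sW on FP64
EFF_FP8_FLOP_PER_JOULE = 2.95e12   # 2.95 Tflop/sW on 8-bit mini-floats
PERF_MIN_FLOP_PER_S = 3.2e9        # 3.2 Gflop/s
PERF_MAX_FLOP_PER_S = 25.3e9       # 25.3 Gflop/s


def energy_per_flop_pj(flop_per_joule: float) -> float:
    """Convert an efficiency in flop/s per W (i.e. flop/J) to picojoules per flop."""
    return 1e12 / flop_per_joule


fp64_pj = energy_per_flop_pj(EFF_FP64_FLOP_PER_JOULE)   # roughly 5.6 pJ per flop
fp8_pj = energy_per_flop_pj(EFF_FP8_FLOP_PER_JOULE)     # roughly 0.34 pJ per flop

# How much further one joule goes on 8-bit mini-floats than on FP64.
efficiency_ratio = EFF_FP8_FLOP_PER_JOULE / EFF_FP64_FLOP_PER_JOULE

# Span of the reported throughput range.
throughput_ratio = PERF_MAX_FLOP_PER_S / PERF_MIN_FLOP_PER_S

if __name__ == "__main__":
    print(f"FP64: {fp64_pj:.2f} pJ/flop, FP8: {fp8_pj:.2f} pJ/flop")
    print(f"efficiency ratio: {efficiency_ratio:.1f}x")
    print(f"throughput ratio: {throughput_ratio:.1f}x")
```

The throughput range spans a factor of about 7.9, which is consistent with 8-way SIMD on the narrowest format, though the summary does not state that mapping. Peak efficiency and peak throughput may have been measured at different voltage and frequency points, so dividing one by the other would not necessarily give a meaningful power figure.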
Technical Details
- Multi-Format Support: The architecture supports a wide range of precisions, including the standard single- and double-precision formats (FP32 and FP64) and transprecision formats such as half-precision (FP16), bfloat16, and custom 8-bit FP formats.
- ISA Extension: The authors extend the RISC-V Instruction Set Architecture (ISA) with operations on the low-precision formats and on SIMD vectors, as well as multi-format operations.
- SIMD Capabilities: The 64-bit RISC-V integration supports five different FP formats across scalar operation modes and 2-way, 4-way, or 8-way SIMD vectors.
- Manufacturing and Testing: The unit was manufactured in Globalfoundries 22FDX technology and tested across a wide voltage range (0.45 V to 1.2 V), demonstrating robust operation across varying power points.
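The relationship between format width and SIMD vector length described above can be sketched as follows. This is an illustrative model only, not FPnew's RTL or its configuration interface. The FP64, FP32, FP16, and bfloat16 field widths are the standard ones. The 1-5-2 layout for the 8-bit format, the rule that the lane count equals the 64-bit datapath width divided by the format width, and all class and function names are assumptions made for this example.

```python
from dataclasses import dataclass

DATAPATH_BITS = 64  # assumed: FP datapath width of the 64-bit RISC-V integration


@dataclass(frozen=True)
class FpFormat:
    name: str
    exp_bits: int
    man_bits: int

    @property
    def width(self) -> int:
        # One sign bit plus exponent and mantissa fields.
        return 1 + self.exp_bits + self.man_bits

    @property
    def simd_lanes(self) -> int:
        # Assumed rule: as many elements as fit into the datapath.
        return DATAPATH_BITS // self.width


FORMATS = [
    FpFormat("FP64", 11, 52),
    FpFormat("FP32", 8, 23),
    FpFormat("FP16", 5, 10),
    FpFormat("bfloat16", 8, 7),
    FpFormat("FP8", 5, 2),  # assumed layout; the summary only says "custom 8-bit"
]


def decode(bits: int, fmt: FpFormat) -> float:
    """Decode an IEEE 754-style bit pattern of the given format to a Python float."""
    sign = -1.0 if (bits >> (fmt.exp_bits + fmt.man_bits)) & 1 else 1.0
    exp = (bits >> fmt.man_bits) & ((1 << fmt.exp_bits) - 1)
    man = bits & ((1 << fmt.man_bits) - 1)
    bias = (1 << (fmt.exp_bits - 1)) - 1
    if exp == (1 << fmt.exp_bits) - 1:
        # All-ones exponent: infinity or NaN.
        return sign * float("inf") if man == 0 else float("nan")
    if exp == 0:
        # Zero exponent: zero or subnormal, no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - fmt.man_bits)
    return sign * (1 + man / (1 << fmt.man_bits)) * 2.0 ** (exp - bias)


if __name__ == "__main__":
    for fmt in FORMATS:
        print(f"{fmt.name}: {fmt.width} bits, {fmt.simd_lanes} lane(s)")
```

Under these assumptions the lane counts come out as 1, 2, 4, 4, and 8, which lines up with the scalar, 2-way, 4-way, and 8-way modes listed above. FP16 and bfloat16 occupy the same 16 bits but trade mantissa bits for exponent bits, which is why a multi-format unit needs to treat them as distinct formats.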
Implications
- Addressing the Power Wall: FPnew provides a critical open-source architectural solution for architects dealing with the slowdown of Moore's Law and the power wall, enabling energy reduction without sacrificing the necessary end-to-end precision in applications.
- RISC-V Ecosystem Enhancement: By providing a highly efficient, open-source FPU and defining necessary ISA extensions for transprecision, FPnew significantly advances the capability of the RISC-V platform for modern workloads like AI/ML, which rely heavily on mixed-precision computation.
- Accelerating Specialized Computing: The configurability and SIMD capabilities of FPnew make it ideal for integration into domain-specific accelerators (DSAs) and customizable hardware, fostering rapid innovation in hardware architecture.