FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing

Abstract

FPnew is a highly configurable, open-source transprecision floating-point unit (TP-FPU) architecture designed to enable energy-proportional computing by handling a wide range of standard and custom FP formats. The work extends the RISC-V ISA with operations on half-precision, bfloat16, and 8-bit formats, as well as on SIMD vectors. Integrated into RISC-V cores, FPnew achieves up to a 1.67x speedup on mixed-precision applications and leading-edge measured energy efficiencies, reaching up to 2.95 Tflop/sW on 8-bit mini-floats.

Report

FPnew: Transprecision FPU Architecture

Key Highlights

  • Open-Source Transprecision FPU (TP-FPU): FPnew is presented as a highly configurable, open-source floating-point unit designed specifically for energy-proportional transprecision computing.
  • Performance Gain: Integration into a 32-bit RISC-V core resulted in a 1.67x speedup for mixed-precision applications, alongside a 37% reduction in system energy compared to an FP32 baseline.
  • Leading-Edge Energy Efficiency: Fabricated silicon in Globalfoundries 22FDX technology demonstrated peak measured efficiencies between 178 Gflop/sW (on FP64) and 2.95 Tflop/sW (on 8-bit mini-floats).
  • High Throughput: The 64-bit core integration achieved performance ranging from 3.2 Gflop/s to 25.3 Gflop/s.
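
The efficiency figures above are easier to compare once inverted into energy per operation, since 1 flop/sW is the same as 1 flop per joule. The following Python sketch does that arithmetic. The function name is hypothetical, and only the two quoted endpoints (178 Gflop/sW and 2.95 Tflop/sW) come from the summary; the two peaks are not necessarily measured at the same operating point, so the ratio is indicative only.

```python
def picojoules_per_flop(flops_per_second_per_watt: float) -> float:
    """Invert an efficiency in flop/s per watt (flop per joule) into pJ per flop."""
    if flops_per_second_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    # 1 J = 1e12 pJ, so pJ/flop = 1e12 / (flop/J)
    return 1e12 / flops_per_second_per_watt


# Peak measured efficiencies quoted in the summary
FP64_EFFICIENCY = 178e9    # 178 Gflop/sW
FP8_EFFICIENCY = 2.95e12   # 2.95 Tflop/sW

fp64_pj = picojoules_per_flop(FP64_EFFICIENCY)
fp8_pj = picojoules_per_flop(FP8_EFFICIENCY)
ratio = FP8_EFFICIENCY / FP64_EFFICIENCY

print(f"FP64: {fp64_pj:.2f} pJ/flop")
print(f"FP8:  {fp8_pj:.2f} pJ/flop")
print(f"Peak FP8 efficiency is {ratio:.1f}x the peak FP64 efficiency")
```

Under these numbers, a peak-efficiency FP64 operation costs roughly 5.6 pJ and an 8-bit operation roughly 0.34 pJ, a ratio of about 16.6x.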

Technical Details

  • Multi-Format Support: The architecture supports a wide range of precisions: the standard FP32 and FP64 formats alongside transprecision formats such as half-precision (FP16), bfloat16, and custom 8-bit FP formats.
  • ISA Extension: Using the unit from a RISC-V core requires extending the Instruction Set Architecture (ISA) with operations on low-precision formats, SIMD vectors, and multi-format operations.
  • SIMD Capabilities: The 64-bit RISC-V integration supports five different FP formats across scalar operation modes and 2-way, 4-way, or 8-way SIMD vectors.
  • Manufacturing and Testing: The unit was manufactured in Globalfoundries 22FDX technology and measured across a wide supply-voltage range (0.45 V to 1.2 V).
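
The relationship between format width and SIMD width in a 64-bit datapath can be sketched in a few lines of Python. This is an illustrative model, not FPnew's implementation. The summary does not list the five formats or their bit layouts, so the table below is an assumption: the standard IEEE 754 layouts for FP64, FP32, and FP16, the common 1-8-7 bfloat16 layout, and a 1-5-2 layout for the 8-bit format. All function and constant names are hypothetical.

```python
import math

DATAPATH_BITS = 64  # from the summary: 64-bit RISC-V integration

# name -> (exponent bits, mantissa bits). ASSUMED layouts, not stated in the summary.
FORMATS = {
    "FP64": (11, 52),
    "FP32": (8, 23),
    "FP16": (5, 10),
    "bfloat16": (8, 7),
    "FP8": (5, 2),
}


def width(exp_bits: int, man_bits: int) -> int:
    """Total storage width: one sign bit plus exponent and mantissa fields."""
    return 1 + exp_bits + man_bits


def simd_lanes(exp_bits: int, man_bits: int, datapath_bits: int = DATAPATH_BITS) -> int:
    """Number of elements of this format that fit in one datapath word."""
    return datapath_bits // width(exp_bits, man_bits)


def decode(bits: int, exp_bits: int, man_bits: int) -> float:
    """Decode an IEEE-754-style bit pattern with the given field widths."""
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:          # all-ones exponent: infinity or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                            # zero or subnormal: no implicit leading 1
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


for name, (e, m) in FORMATS.items():
    print(f"{name:9s} width={width(e, m):2d} bits  lanes={simd_lanes(e, m)}")
```

With these assumed layouts, the lane counts come out as 1 (scalar), 2, 4, 4, and 8, which matches the scalar and 2-way, 4-way, or 8-way SIMD modes described above. The same `decode` helper handles every format because only the field widths change, which is the property a multi-format unit exploits.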

Implications

  • Addressing the Power Wall: FPnew provides a critical open-source architectural solution for architects dealing with the slowdown of Moore's Law and the power wall, enabling energy reduction without sacrificing the necessary end-to-end precision in applications.
  • RISC-V Ecosystem Enhancement: By providing a highly efficient, open-source FPU and defining necessary ISA extensions for transprecision, FPnew significantly advances the capability of the RISC-V platform for modern workloads like AI/ML, which rely heavily on mixed-precision computation.
  • Accelerating Specialized Computing: The configurability and SIMD capabilities of FPnew make it well suited for integration into domain-specific accelerators (DSAs) and other customizable hardware.