Ocelot3: Full Vector “V” Extension for BOOM

Abstract

Ocelot3 is the latest iteration of the open-source project that adds vector support to the BOOM RISC-V core, and it achieves full compatibility with the RVV 1.0 specification. This generation uses a decoupled Vector Processing Unit (VPU) connected through the Open Vector Interface (OVI), which keeps the VPU modular and open to community contributions. The main addition over Ocelot2 is support for segmented vector memory access instructions, which require transposing the data as it is accessed.

Report

Key Highlights

  • Full RVV 1.0 Support: Ocelot3 achieves complete compliance with the RISC-V Vector extension (RVV) version 1.0 standard.
  • BOOM Core Integration: The project successfully adds vector capabilities to the open-source, high-performance BOOM (Berkeley Out-of-Order Machine) core.
  • Decoupled Architecture: The design uses a decoupled Vector Processing Unit (VPU).
  • Segmented Memory Access: The major addition over Ocelot2 is support for segmented vector memory access instructions.
  • Open Interface: The VPU connects through the Open Vector Interface (OVI), which promotes modularity and community development.

Technical Details

  • Target Core: BOOM (Berkeley Out-of-Order Machine).
  • Vector Standard: RVV 1.0 (Full support).
  • VPU Connection: Utilizes the Open Vector Interface (OVI).
  • Implementation Challenge: Segmented vector memory access instructions were difficult to implement because the data must be transposed during the access.
  • Affiliations: Developed by Kishore Senthil Kumar and Kuan-Yu Chen, associated with Tenstorrent and the University of Michigan.
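
To illustrate the transposition behind segmented accesses, the following Python sketch models the architectural effect of a unit-stride segment load and store as defined by RVV 1.0: field f of segment i in memory ends up in element i of destination register f. It is a software model only and says nothing about Ocelot3's hardware datapath. The function names segment_load and segment_store, the flat list standing in for memory, and the element-granular addressing are all assumptions made for illustration.

```python
def segment_load(mem, base, nf, vl):
    """Model a unit-stride segment load (hypothetical helper).

    mem  : flat list standing in for memory, one entry per element
    base : index of the first element of segment 0
    nf   : number of fields per segment
    vl   : number of segments to load (vector length)

    Returns nf lists ("registers"); register f holds field f of every segment.
    """
    regs = [[0] * vl for _ in range(nf)]
    for i in range(vl):        # segment index
        for f in range(nf):    # field index within the segment
            # Memory is segment-major, registers are field-major:
            # this index swap is the transposition.
            regs[f][i] = mem[base + i * nf + f]
    return regs


def segment_store(mem, base, regs):
    """Inverse of segment_load: interleave the registers back into memory."""
    nf = len(regs)
    vl = len(regs[0])
    for i in range(vl):
        for f in range(nf):
            mem[base + i * nf + f] = regs[f][i]


# Three RGB pixels stored as an array of structures (interleaved fields).
pixels = [10, 20, 30, 11, 21, 31, 12, 22, 32]
r, g, b = segment_load(pixels, base=0, nf=3, vl=3)
print(r, g, b)  # [10, 11, 12] [20, 21, 22] [30, 31, 32]
```

In this model the memory image is segment-major while the register group is field-major, so every segmented access amounts to a transpose of an nf-by-vl matrix. A hardware implementation has to perform the same reordering on the data moving between the memory system and the vector register file.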

Implications

  • Accelerated Open-Source Performance: By providing full RVV 1.0 compliance on the BOOM core, Ocelot3 raises the performance potential of open-source RISC-V hardware, especially for data-parallel workloads such as AI/ML and scientific computing.
  • Validation of RVV Standard: The successful, open-source implementation of all features, including challenging components like segmented loads/stores, helps validate the maturity and feasibility of the RVV 1.0 specification.
  • Standardized Modularity: The use of the Open Vector Interface encourages broader ecosystem participation by standardizing how external vector units integrate with base RISC-V cores, fostering greater choice and innovation in VPU designs.
  • Handling Complex Data: Segmented memory access lets the processor handle structured data layouts efficiently, such as arrays of structures with interleaved fields, by moving each field into its own vector register. This is critical for real-world application performance.