Enabling High Performance RISC-V Software for AI in the Real World

Abstract

Embecosm demonstrated acceleration of PyTorch AI workloads on RISC-V cores using the oneAPI Construction Kit. The work, presented at RISC-V Europe 2025, showed that the unified oneAPI ecosystem, which builds on open standards such as C++ and Khronos SYCL, can be ported to new hardware architectures. The case study supported the feasibility of high-performance AI on RISC-V by exploring over a thousand core configurations in emulation and validating on FPGAs.

Report

Key Highlights

  • AI Acceleration on RISC-V: The project demonstrated methods for accelerating PyTorch software on RISC-V cores.
  • oneAPI Portability: The core innovation is the use of the oneAPI Construction Kit to port the oneAPI ecosystem, a unified environment for expressing parallelism, to the RISC-V architecture.
  • Extensive Testing: Over a thousand RISC-V core configurations were explored and tested using emulation, with further validation performed on FPGA platforms.
  • Public Presentation: The findings were presented as a poster at the RISC-V Europe 2025 conference.

Technical Details

  • Toolchain: The oneAPI Construction Kit was the primary tool used to bring the oneAPI ecosystem to the new hardware target.
  • Target Software: The focus application was PyTorch, a leading machine learning framework.
  • Standards/Languages: The oneAPI ecosystem relies on established open standards, specifically C++ and Khronos SYCL, which are crucial for expressing parallelism.
  • Hardware Exploration: Performance was explored across more than a thousand RISC-V core configurations in emulation, with proof-of-concept testing on FPGAs.
  • Contributors: The work was carried out by Embecosm, building upon technology developed by Codeplay Software Ltd (whose architect, Alastair Murray, specializes in SYCL language runtimes).
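
One way to picture the configuration sweep in the Hardware Exploration item is as a Cartesian product over core parameters. The following is a minimal sketch, not Embecosm's tooling: the source does not say which parameters were varied, so every axis name and value below (xlen, vlen_bits, vector_lanes, cache sizes, issue_width) is a hypothetical assumption chosen only to show how a handful of options per axis quickly yields more than a thousand variants.

```python
from itertools import product

# Hypothetical parameter axes (assumptions for illustration only;
# the source does not list the parameters that were actually varied).
PARAMETER_SPACE = {
    "xlen": [32, 64],                    # base integer register width
    "vlen_bits": [128, 256, 512, 1024],  # vector register length
    "vector_lanes": [1, 2, 4, 8],        # parallel vector lanes
    "l1d_kib": [16, 32, 64, 128],        # L1 data cache size
    "l2_kib": [256, 512, 1024, 2048],    # L2 cache size
    "issue_width": [1, 2],               # instructions issued per cycle
}


def enumerate_configs(space):
    """Yield one dict per point in the Cartesian product of all axes."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(enumerate_configs(PARAMETER_SPACE))
print(len(configs))  # 1024 = 2 * 4 * 4 * 4 * 4 * 2
```

In a real sweep each generated configuration would be handed to an emulator run and the resulting measurements recorded; that step is omitted here because the source does not describe the emulation setup.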

Implications

  • Maturation of RISC-V Ecosystem: This work improves the software readiness of RISC-V and supports its viability as a high-performance platform for demanding AI and machine learning applications.
  • Heterogeneous Computing Support: Integrating the oneAPI/SYCL framework indicates that RISC-V platforms can make use of heterogeneous processing elements (accelerators and GPUs), a necessity for modern AI.
  • Developer Adoption: By integrating major toolchains like oneAPI, RISC-V lowers the barrier to entry for developers already familiar with C++ and parallel programming standards, encouraging broader adoption and accelerating software ecosystem growth.
  • Open Standard Commitment: Relying on open standards like SYCL and C++ reduces proprietary vendor lock-in, which is highly valuable for the open nature of the RISC-V instruction set architecture (ISA).