Work-In-Progress: Accelerating Numpy With OpenBLAS For Open-Source RISC-V Chips
Abstract
This preliminary work presents a methodology for accelerating high-level Python applications, specifically those using the Numpy library, on heterogeneous RISC-V Systems-on-Chip (SoCs). The approach modifies the OpenBLAS library so that it uses OpenMP to offload selected linear algebra kernels to a programmable manycore accelerator (PMCA). By linking Numpy against this modified library, the authors demonstrate the acceleration of operators such as matrix multiplication on an open-source RISC-V platform emulated on an FPGA.
Report
Key Highlights
- Numpy applications are accelerated by linking them against a custom version of the OpenBLAS library.
- The primary innovation is modifying OpenBLAS to offload selected linear algebra kernels (e.g., matrix multiplication) to the SoC's programmable manycore accelerator (PMCA).
- The target platform is an open-source heterogeneous RISC-V System-on-Chip (SoC).
- The acceleration mechanism utilizes OpenMP directives to manage kernel execution on the accelerator.
Technical Details
- Software Stack: The Python package Numpy is linked against the modified OpenBLAS library.
- Offloading Mechanism: OpenMP is used to offload the selected kernels, together with their data, to the accelerator.
- Target Architecture: A heterogeneous SoC with two distinct components:
  - Host: rv64g architecture, capable of running Linux.
  - Accelerator: programmable manycore accelerator (PMCA), rv32imafd architecture.
- Implementation: The entire heterogeneous platform is implemented and evaluated using FPGA emulation.
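From the application's point of view, the software stack above is meant to be transparent. The following is a minimal sketch, not code from the paper: it assumes only a standard NumPy installation, and the matrix sizes and variable names are arbitrary. It prints which BLAS library the local NumPy build is linked against and runs an ordinary matrix multiplication, the kind of call that a BLAS-linked NumPy hands to the library's GEMM routine.

```python
import numpy as np

# Print the BLAS/LAPACK libraries this NumPy build was linked against.
# On the platform described above this would presumably report the
# modified OpenBLAS (an assumption); elsewhere it reports the local BLAS.
np.show_config()

# Ordinary application code: nothing here refers to an accelerator.
# For 2-D float64 operands, `@` is dispatched to the BLAS GEMM routine
# when NumPy has been built against a BLAS library.
rng = np.random.default_rng(seed=0)
a = rng.standard_normal((64, 32))
b = rng.standard_normal((32, 48))
c = a @ b

# Cross-check the product with a second NumPy routine that spells out
# the sum over the shared dimension k explicitly.
c_ref = np.einsum("ik,kj->ij", a, b)
print(c.shape, np.allclose(c, c_ref))
```

Whether such a call actually runs on the accelerator depends on the library NumPy was linked against, not on the Python code, which is why the same script can run unchanged on any machine.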
Implications
- Bridging Software/Hardware Gap: This work aims to simplify the use of RISC-V hardware heterogeneity. Because the offloading lives inside the linked library, high-level scientific applications (Python/Numpy) can benefit from hardware acceleration without deep manual modification.
- Enhancing RISC-V Capabilities: By accelerating crucial linear algebra operations (BLAS), this effort could make open-source RISC-V chips more viable for workloads in scientific computing, machine learning, and data processing, although the work is still preliminary.
- Fostering Open-Source Ecosystem: The focus on open-source hardware (RISC-V) and software (Numpy, OpenBLAS) promotes the development of a fully transparent and customizable high-performance computing environment.