SpikeStream: Accelerating Spiking Neural Network Inference on RISC-V Clusters with Sparse Computation Extensions
Abstract
This paper presents SpikeStream, an optimization technique that accelerates Spiking Neural Network (SNN) inference on general-purpose RISC-V multicore clusters using a low-overhead ISA extension for sparse computation streaming. SpikeStream maps weight accesses to register-mapped memory streams, overcoming the low core utilization caused by SNN event sparsity. Experiments on end-to-end Spiking-VGG11 inference show a 4.39x speedup over a non-streaming parallel baseline and competitive energy efficiency, outperforming specialized neuromorphic hardware such as Loihi and LSMCore.
Report
Key Highlights
- General-Purpose Acceleration: The research focuses on accelerating SNN inference on standard, general-purpose multicore RISC-V clusters, addressing the high silicon cost and lack of flexibility associated with dedicated neuromorphic processors.
- SpikeStream Optimization: Introduces SpikeStream, a low-level software design and parallelization technique for handling sparse SNN events efficiently; the irregular access pattern it targets is sketched after this list.
- Performance Metrics: Achieves a 4.39x speedup compared to a non-streaming parallel baseline.
- Efficiency Gains: Demonstrates a 3.46x energy efficiency gain over LSMCore and a 2.38x performance gain over the specialized neuromorphic chip Loihi.
- Utilization Improvement: Raises cluster utilization from a 9.28% baseline to 52.3%.
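To make the sparsity and utilization challenge concrete, below is a minimal C sketch of a plain event-driven fully connected layer update. This is not the paper's code; the function name, signature, and data layout are illustrative assumptions. It shows why a baseline implementation underuses the cores: each spike triggers a data-dependent gather of a weight row, so the loops are dominated by address generation and load latency rather than arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline event-driven update of one fully connected SNN layer:
 * every input neuron that spiked this timestep adds its weight row
 * into the output membrane potentials. The row address depends on
 * the spike list, so each outer iteration starts with an indirect,
 * data-dependent access that stalls an in-order core on loads and
 * address generation instead of arithmetic. */
void snn_fc_baseline(const float *w,         /* [n_in x n_out] weights      */
                     const uint32_t *spikes, /* indices of active inputs    */
                     size_t n_spikes,
                     float *v,               /* [n_out] membrane potentials */
                     size_t n_out)
{
    for (size_t e = 0; e < n_spikes; e++) {
        const float *row = &w[(size_t)spikes[e] * n_out];
        for (size_t j = 0; j < n_out; j++)
            v[j] += row[j]; /* dense sweep over one sparsely selected row */
    }
}
```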
Technical Details
- Target Architecture: General-purpose multicore RISC-V clusters.
- Methodology: Relies on a low-overhead RISC-V Instruction Set Architecture (ISA) extension designed for streaming sparse computations.
- SpikeStream Mechanism: Maps weight accesses onto two types of register-mapped memory streams: affine streams for regular strided accesses and indirect streams for event-driven gathers (see the sketch after this list).
- Evaluation Model: The approach was validated on end-to-end inference of the Spiking-VGG11 model.
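The sketch below recasts the same inner loop using register-mapped streams. The intrinsics stream_cfg_indirect and stream_pop are hypothetical stand-ins; this summary does not give the extension's actual instruction names or semantics. The structural point is that the stream unit, configured once, walks the spike-index list and delivers weight-row elements through a register, so the core's loop body shrinks to the accumulate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stream intrinsics, used only to show the shape of the
 * transformation; the real extension's mnemonics and operands are not
 * given in this summary. */
void  stream_cfg_indirect(int id, const float *base, const uint32_t *idx,
                          size_t idx_len, size_t row_len); /* assumption */
float stream_pop(int id);                                  /* assumption */

/* Same accumulation as the baseline, but the weight gather is handed
 * to an indirect hardware stream: the stream unit walks the spike-index
 * list and delivers consecutive weight-row elements through a
 * register-mapped channel, leaving only the accumulate in the core's
 * inner loop. An affine stream could cover the strided sweep over v
 * in the same way. */
void snn_fc_streamed(const float *w, const uint32_t *spikes,
                     size_t n_spikes, float *v, size_t n_out)
{
    /* Stream 0: for each index s in spikes, emit w[s*n_out .. s*n_out+n_out-1]. */
    stream_cfg_indirect(0, w, spikes, n_spikes, n_out);

    for (size_t e = 0; e < n_spikes; e++)
        for (size_t j = 0; j < n_out; j++)
            v[j] += stream_pop(0); /* next weight arrives via the stream */
}
```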
Implications
- Democratizing SNNs: SpikeStream demonstrates that competitive SNN acceleration can be achieved on flexible, general-purpose RISC-V hardware, potentially lowering the barrier to entry for SNN deployment compared to relying solely on expensive, specialized ASICs.
- RISC-V Ecosystem Expansion: The work validates the utility of sparse computation ISA extensions within the RISC-V framework, encouraging further development of domain-specific extensions that enhance AI and machine learning capabilities.
- Energy Efficiency in Edge AI: By tackling the sparsity challenge efficiently on standard platforms, this method advances the viability of high-efficiency, event-driven AI (SNNs) for resource-constrained edge and embedded systems built on RISC-V technology.