Scale up your In-Memory Accelerator: Leveraging Wireless-on-Chip Communication for AIMC-based CNN Inference
Abstract
The scalability of Analog In-Memory Computing (AIMC) is limited by the high bandwidth and low latency that on-chip communication must deliver, which traditional wired interconnects cannot support. This work introduces a many-tile AIMC architecture that addresses this bottleneck with high-performance inter-tile Wireless-on-Chip (WoC) communication. The proposed heterogeneous system combines parallel RISC-V cores with AIMC tiles and uses millimeter-wave and terahertz transceivers to sustain the throughput needed for scalable CNN inference.
Report
Key Highlights
- AIMC (Analog In-Memory Computing) offers superior peak performance for matrix-vector multiplication, but traditional wired on-chip communication becomes the bottleneck as the design scales up.
- The primary innovation is leveraging emerging Wireless-on-Chip (WoC) communication to overcome the bandwidth and latency limitations inherent in large AIMC tile arrays.
- The proposed system features a many-tile heterogeneous architecture that uses wireless transceivers to supply data at high bandwidth to the accelerator units.
- The architecture integrates both parallel RISC-V cores and specialized AIMC tiles, which calls for extensive design space exploration (DSE) to get the most benefit from the wireless infrastructure.
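To make the bottleneck and the role of DSE in the highlights above concrete, here is a minimal sketch of a toy design-space sweep. It is not the paper's model: the function names, the parameters, and the assumptions of a single shared link, a fixed data volume per layer, and perfect compute/communication overlap are all hypothetical and chosen only for illustration.

```python
from itertools import product


def layer_time_s(macs, bytes_moved, n_tiles,
                 macs_per_tile_per_s, link_bytes_per_s, link_latency_s=0.0):
    """Toy estimate of one layer's time: the slower of compute and communication."""
    # Compute time shrinks as tiles are added (hypothetical ideal scaling).
    compute_s = macs / (n_tiles * macs_per_tile_per_s)
    # Communication time depends only on the link, not on the tile count.
    comm_s = link_latency_s + bytes_moved / link_bytes_per_s
    # Assumption: compute and communication overlap perfectly.
    return max(compute_s, comm_s)


def sweep(macs, bytes_moved, tile_counts, link_bandwidths,
          macs_per_tile_per_s, link_latency_s=0.0):
    """Evaluate every (tile count, link bandwidth) pair, fastest point first."""
    points = [
        (layer_time_s(macs, bytes_moved, n, macs_per_tile_per_s, bw, link_latency_s), n, bw)
        for n, bw in product(tile_counts, link_bandwidths)
    ]
    return sorted(points)
```

In this toy model, once a layer is communication-bound, adding tiles no longer reduces the estimated time; only a faster link does. That is the scaling limit the highlights attribute to wired interconnects.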
Technical Details
- Processing Paradigm: Analog In-Memory Computing (AIMC) is used for efficient matrix-vector multiplication, targeting CNN inference acceleration.
- Communication Technology: Wireless-on-Chip (WoC) in the millimeter-wave (mmWave) and terahertz (THz) bands, providing the high bandwidth and low latency needed for inter-tile data transfer.
- Architecture Type: Heterogeneous, many-tile system integrating multiple computing clusters.
- Components: The architecture comprises specialized AIMC tiles for acceleration and parallel RISC-V cores for general-purpose computing and orchestration.
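The mapping of a matrix-vector multiplication onto many tiles can be sketched as follows. This is a purely digital illustration of the partitioning, not the paper's implementation: the tile dimensions, the function names, and the exact arithmetic (no analog noise or quantization) are assumptions made for the example.

```python
import math


def count_tiles(out_dim, in_dim, tile_rows=256, tile_cols=256):
    """Number of tiles needed to hold an out_dim x in_dim weight matrix."""
    return math.ceil(out_dim / tile_rows) * math.ceil(in_dim / tile_cols)


def tiled_mvm(weights, x, tile_rows=256, tile_cols=256):
    """Compute weights @ x by splitting the weight matrix into fixed-size tiles.

    weights is a list of rows; x is a list with one entry per column.
    """
    out_dim, in_dim = len(weights), len(x)
    y = [0.0] * out_dim
    for r0 in range(0, out_dim, tile_rows):
        for c0 in range(0, in_dim, tile_cols):
            # Each (r0, c0) block stands in for one tile (hypothetical mapping).
            # The tile sees only its slice of x and yields partial sums, which
            # would have to travel over the interconnect to be accumulated.
            for r in range(r0, min(r0 + tile_rows, out_dim)):
                row = weights[r]
                partial = sum(row[c] * x[c]
                              for c in range(c0, min(c0 + tile_cols, in_dim)))
                y[r] += partial
    return y
```

Every tile in the same block row contributes a partial sum to the same outputs, so the volume of inter-tile traffic grows with the number of tiles a layer is spread across. This is where the bandwidth and latency of the inter-tile link start to matter.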
Implications
- Scalability of AI Acceleration: This approach points to a viable path toward large-scale, high-throughput AIMC accelerators by targeting the on-chip communication bottleneck, which would make AIMC more practical for demanding real-world applications.
- RISC-V Ecosystem Integration: The architecture uses RISC-V cores as control and general-purpose compute units alongside the analog AIMC hardware, supporting RISC-V's role in future heterogeneous computing platforms.
- Interconnect Direction: Applying WoC in the mmWave/THz bands positions it as a candidate interconnect for future chips that integrate many high-performance, high-bandwidth accelerators.