Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip

Abstract

This report describes the design of Epiphany-V, a highly parallel System-On-Chip with 1024 64-bit RISC processors, fabricated in 16nm FinFET technology. The chip integrates 64MB of on-chip SRAM, three 136-bit wide mesh Networks-On-Chip for core-to-core communication, and 1024 programmable IO pins. It has been taped out and is being manufactured by TSMC. The work was supported by DARPA funding.

Report

Key Highlights

  • Unprecedented Core Density: Features 1024 individual 64-bit RISC processors on a single System-On-Chip (SoC).
  • Advanced Fabrication: Utilizes 16nm FinFET technology, representing a state-of-the-art node at the time of submission.
  • Manufacturing Status: The chip has been successfully taped out and is actively being manufactured by TSMC.
  • Funding Source: Development was supported by the Defense Advanced Research Projects Agency (DARPA).

Technical Details

  • Processor Architecture: Array of 1024 64-bit RISC processors.
  • Memory: Integrated 64MB of on-chip Static Random-Access Memory (SRAM).
  • Interconnect: Three separate 136-bit wide mesh Networks-On-Chip (NoCs) are used for internal communication.
  • I/O Capabilities: Features 1024 programmable IO pins.
  • Process Node: 16nm FinFET technology.
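The figures above imply a few derived quantities. The following is a minimal sketch of that arithmetic, assuming that "64MB" means 64 MiB and that the SRAM is divided evenly among the cores; neither assumption is stated in this summary, and the function names are hypothetical.

```python
# Figures taken from the Technical Details list above.
CORES = 1024
SRAM_BYTES = 64 * 1024 * 1024   # assumption: "64MB" read as 64 MiB
NOC_COUNT = 3
NOC_WIDTH_BITS = 136


def sram_per_core(total_bytes=SRAM_BYTES, cores=CORES):
    """Bytes of SRAM per core, assuming an even split (an assumption)."""
    return total_bytes // cores


def combined_noc_width(count=NOC_COUNT, width_bits=NOC_WIDTH_BITS):
    """Total width in bits of all mesh networks taken together."""
    return count * width_bits


if __name__ == "__main__":
    print(sram_per_core() // 1024, "KiB of SRAM per core")
    print(combined_noc_width(), "bits across the three NoCs")
```

Under these assumptions each core would have 64 KiB of local SRAM, and the three networks together would be 408 bits wide.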

Implications

  • Extreme Parallelism Demonstration: Epiphany-V serves as a proof-of-concept that 1024 cores can be placed on a single chip using a commercial foundry process, supporting the case for scalable tiled architectures.
  • Advancing RISC Architectures: The design uses 64-bit RISC cores, demonstrating the viability of lightweight, scalable RISC designs for extreme parallelism, a direction also pursued by the broader RISC-V community.
  • Targeting High-Performance Embedded Computing: The combination of 1024 cores, 64MB of on-chip SRAM, and 1024 programmable I/O pins makes this architecture well suited to latency-sensitive embedded applications, signal processing, and specialized machine learning acceleration.
  • System-on-Chip (SoC) Innovation: The three 136-bit wide mesh NoCs are the design's answer to the high-bandwidth communication traffic generated by 1024 processing elements.
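To give a sense of what mesh communication means at this scale, here is a minimal sketch of hop counting on a 2D mesh. It assumes the 1024 cores are laid out as a 32 x 32 grid, numbered row by row, and that packets follow dimension-order (XY) routing. None of these details appear in this summary, so the layout, the numbering, and the function names are all illustrative assumptions.

```python
SIDE = 32  # assumption: 1024 cores arranged as a 32 x 32 grid


def core_to_rc(core_id, side=SIDE):
    """Map a core id to (row, col), assuming row-major numbering."""
    if not 0 <= core_id < side * side:
        raise ValueError(f"core id {core_id} out of range")
    return divmod(core_id, side)


def xy_hops(src, dst, side=SIDE):
    """Hops between two cores under dimension-order routing.

    On a mesh this equals the Manhattan distance between the cores.
    """
    src_row, src_col = core_to_rc(src, side)
    dst_row, dst_col = core_to_rc(dst, side)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


def mesh_diameter(side=SIDE):
    """Longest shortest path on the mesh: corner to opposite corner."""
    return 2 * (side - 1)


if __name__ == "__main__":
    print(xy_hops(0, 1023), "hops from core 0 to core 1023")
    print(mesh_diameter(), "hops is the mesh diameter")
```

Under these assumptions the worst case, from one corner to the opposite corner, is 62 hops, while neighboring cores are one hop apart.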