An Embedded RISC-V Core with Fast Modular Multiplication
Abstract
This work introduces an embedded RISC-V core that targets the security and power-consumption challenges of battery-operated IoT devices by accelerating cryptographic operations. Its key contribution is a custom instruction for modular multiplication with a Partial Execution mode that blocks the processor for only two cycles, regardless of operand size. The design achieves up to a 13x speedup and up to 95% lower power consumption on elliptic-curve cryptography benchmarks, and the configuration with 128-bit modular multiplication runs at 136 MHz on ASIC.
Report
Key Highlights
- Core Innovation: A custom instruction extension added to an embedded RISC-V core specifically for accelerating modular multiplication.
- Performance Metric: Achieves up to a 13x speedup compared to standard software implementations.
- Power Efficiency: Reduces overall power consumption by up to 95%, critical for battery-operated IoT nodes.
- Minimal Latency: In Partial Execution mode, the custom instruction typically blocks the processor for only two cycles, regardless of the size of the operation.
- Operating Frequency: The CPU with 128-bit modular multiplication runs at 136 MHz on ASIC and 81 MHz on FPGA.
Technical Details
- Base Architecture: The proof-of-concept CPU adopts the embedded (E) variant and the compressed (C) extension of the RISC-V instruction set.
- Acceleration Method: The modular multiplication is handled via an extended custom instruction, providing greater flexibility compared to fixed-function hardware accelerators.
- Area Overhead: Adding the modular multiplication hardware results in an average area overhead of 41% over the base RISC-V architecture.
- Cryptography Focus: The design was benchmarked specifically using recent algorithms in the field of Elliptic-Curve Cryptography (ECC).
- Scalability: The custom instruction is designed to handle modular multiplication of “any size” while keeping the two-cycle blocking behavior.
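The operation that the custom instruction accelerates can be illustrated with a small software reference model. The sketch below uses a textbook interleaved (shift-and-add) modular multiplication; this summary does not describe the algorithm or datapath of the actual hardware, so the function name `modmul_interleaved` and the `width` parameter are illustrative assumptions, not details from the report.

```python
def modmul_interleaved(a: int, b: int, m: int, width: int = 128) -> int:
    """Return (a * b) mod m using interleaved shift-and-add reduction.

    Illustrative reference model only (assumption): scans the bits of b
    from most significant to least significant and keeps the accumulator
    below m after every step, so no value wider than m + 1 bits is needed.
    """
    if m <= 0:
        raise ValueError("modulus must be positive")
    if b < 0 or b >> width:
        raise ValueError("b must be non-negative and fit in `width` bits")

    a %= m          # reduce a once so that every addition stays below 2*m
    acc = 0
    for i in reversed(range(width)):
        acc <<= 1                   # double the accumulator
        if acc >= m:
            acc -= m                # one conditional subtraction suffices
        if (b >> i) & 1:
            acc += a                # add a when the current bit of b is set
            if acc >= m:
                acc -= m
    return acc
```

The loop count depends only on `width`, so the same routine covers 128-bit operands or larger ones by passing a different `width`. A software loop of this kind costs at least one iteration per operand bit on the CPU, which gives a sense of why moving the operation into dedicated hardware can pay off.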
Implications
- IoT Security: The design lowers the power budget required for robust encryption and authentication on resource-constrained IoT end-nodes, addressing a major security concern.
- RISC-V Extensibility: It showcases the flexibility of RISC-V’s custom instruction mechanism, which the work presents as a more adaptable alternative to fixed-function hardware accelerators.
- Real-time Responsiveness: By limiting the processor stall to two cycles, the design keeps the device more responsive to real-time events, mitigating a major drawback of traditional custom instruction solutions.
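The responsiveness argument above can be made concrete with a toy cycle model. The sketch below compares a fully blocking instruction with Partial Execution under stated assumptions: only the two-cycle issue stall comes from this summary, while the multiplier latency values, the function names, and the rule that reading a result early waits for the multiplier to finish are hypothetical.

```python
ISSUE_STALL_CYCLES = 2  # from the summary: processor blocked for two cycles


def blocking_total_cycles(mult_latency: int, independent_work: int) -> int:
    """Blocking instruction: the CPU waits for the full multiplication,
    then runs the independent instructions afterwards."""
    return mult_latency + independent_work


def partial_total_cycles(mult_latency: int, independent_work: int) -> int:
    """Partial Execution (assumed model): the CPU stalls for the issue
    cycles, runs independent work while the multiplier is busy, and only
    waits further if it needs the result before the multiplier is done."""
    return max(mult_latency, ISSUE_STALL_CYCLES + independent_work)


def longest_stall(mult_latency: int, partial: bool) -> int:
    """Longest uninterrupted stall at issue, a rough proxy for the added
    delay before the core can react to a real-time event."""
    return ISSUE_STALL_CYCLES if partial else mult_latency


if __name__ == "__main__":
    # Hypothetical latency of 130 cycles and 40 cycles of independent work.
    print(blocking_total_cycles(130, 40))
    print(partial_total_cycles(130, 40))
    print(longest_stall(130, partial=False), longest_stall(130, partial=True))
```

In this model the stall at issue stays at two cycles however large the hypothetical multiplier latency is, which mirrors the "regardless of size" claim; the total-cycle benefit depends on how much independent work the program can schedule while the multiplication is in flight.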