Architecture
Research
CVA6 RISC-V Virtualization: Architecture, Microarchitecture, and Design Space Exploration
Abstract
This article describes the implementation and optimization of hardware virtualization support for the open-source RISC-V CVA6 core, encompassing architecture and microarchitecture enhancements. The authors introduce specific structures like the G-Stage TLB (GTLB)
Research
RedMule: A Mixed-Precision Matrix-Matrix Operation Engine for Flexible and Energy-Efficient On-Chip Linear Algebra and TinyML Training Acceleration
Abstract
RedMulE is a specialized, mixed-precision matrix multiplication engine designed to enable energy-efficient TinyML training, which traditionally requires costly floating-point operations. Integrated into a RISC-V PULP cluster, the engine supports FP16 and hybrid
Research
TinyVers: A Tiny Versatile System-on-chip with State-Retentive eMRAM for ML Inference at the Extreme Edge
Abstract
TinyVers is an ultra-low power, versatile System-on-Chip designed for Machine Learning inference at the Extreme Edge, integrating a RISC-V host processor and a dataflow reconfigurable ML accelerator. The SoC leverages state-retentive embedded
Research
BARVINN: Arbitrary Precision DNN Accelerator Controlled by a RISC-V CPU
Abstract
BARVINN is an open-source DNN accelerator that enables deep learning inference at arbitrary precision using dedicated, bit-level configurable Processing Elements (PEs). The system is efficiently managed by a dedicated RISC-V CPU controller,
Research
Generic Tagging for RISC-V Binaries
Abstract
The paper introduces COGENT, a generic instruction tag generator designed for RISC-V binaries to simplify the implementation of custom hardware security solutions without requiring specialized compilers. COGENT associates configurable tags (1 to
Research
TCN-CUTIE: A 1036 TOp/s/W, 2.72 µJ/Inference, 12.2 mW All-Digital Ternary Accelerator in 22 nm FDX Technology
Abstract
TCN-CUTIE is a novel, all-digital ternary neural network accelerator implemented in 22 nm FDX technology and integrated into a RISC-V SoC, designed specifically for stringent TinyML constraints. It achieves a record peak