Research
Programming Language Assisted Waveform Analysis: A Case Study on the Instruction Performance of SERV
Abstract
This paper addresses the difficulty of fine-grained RISC-V core analysis in a rapidly expanding ecosystem by proposing a programming-language-assisted methodology. The method uses WAWK, a front end for the Waveform Analysis Language,
Research
Optimized Real-Time Assembly in a RISC Simulator
Abstract
This article introduces the Assembly/Simulation Platform for Illustration of RISC-V in Education (ASPIRE), an integrated tool designed to teach RISC-V ISA and CPU architecture concepts. The authors evaluate two assembly algorithms
Research
Tensor Slicing and Optimization for Multicore NPUs
Abstract
This paper introduces the Tensor Slicing Optimization (TSO) pass for the TensorFlow XLA/LLVM compiler, designed to improve CNN performance on highly constrained Multicore Neural Processor Units (NPUs). TSO efficiently partitions convolution
Research
DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training
Abstract
DARKSIDE is a System-on-Chip featuring a heterogeneous cluster of eight RISC-V cores designed for extreme-edge (TinyML) DNN inference and training, integrating 2-bit to 32-bit mixed-precision integer capabilities. To boost performance, the cluster
Research
Simulation Environment with Customized RISC-V Instructions for Logic-in-Memory Architectures
Abstract
Addressing the challenges of the memory wall, this work proposes using customized RISC-V instructions to support Logic-in-Memory (LiM) operations within computing architectures. The key innovation is a modular, cycle-accurate simulation environment developed
Research
Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-Core Computing Clusters for Reliable Processing in Space
Abstract
This paper presents Hybrid Modular Redundancy (HMR), a novel fault-tolerance scheme designed for RISC-V multi-core computing clusters intended for space applications. HMR enables flexible, on-demand dual-core or triple-core lockstep grouping with runtime