A Flexible Template for Edge Generative AI with High-Accuracy Accelerated Softmax & GELU
Abstract
This paper introduces a BFloat16 RISC-V acceleration template for edge Generative AI, specifically addressing the performance bottleneck caused by softmax and GELU non-linearities in Transformer models. The innovation lies in SoftEx, a