Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR
Abstract
Aquas is an MLIR-based holistic hardware-software co-design framework that addresses the performance limitations of existing open-source RISC-V ASIP frameworks. On the hardware side, it improves specialization with a burst DMA engine and advanced HLS optimizations. On the software side, it provides an e-graph-based retargetable compiler with a novel instruction matching engine. Together, these yield speedups of up to 9.27x on real-world workloads, including LLM inference and point cloud processing.
Report
Key Highlights
- Holistic Co-Design: Aquas is introduced as a comprehensive framework for Application-Specific Instruction-Set Processors (ASIPs) based on RISC-V, focusing on simultaneous hardware and software optimization.
- MLIR Foundation: The framework leverages the Multi-Level Intermediate Representation (MLIR) to unify the optimization pipeline across both compiler and synthesis stages.
- Performance: In the reported evaluation, Aquas achieves speedups of up to 9.27x on real-world workloads.
- Target Domains: The evaluated workloads include point cloud processing and Large Language Model (LLM) inference.
Technical Details
- Compiler Infrastructure: The software side is a retargetable compiler built on an e-graph, a data structure that compactly represents many equivalent forms of a program, so instruction selection can consider rewritten variants of the code rather than only its original form.
- Novel Matching Engine: A new matching engine within the e-graph-based compiler maps program fragments onto the custom instructions of the specialized hardware, with the goal of more efficient and accurate instruction matching.
- Hardware Synthesis Enhancements: Hardware specialization is improved through the integration of advanced High-Level Synthesis (HLS) optimizations.
- Memory Access: To overcome memory bandwidth bottlenecks, the framework incorporates a burst Direct Memory Access (DMA) engine, providing fast memory access capabilities to the synthesized ASIP hardware.
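To make the e-graph idea above concrete, the following is a minimal toy sketch, not Aquas's matching engine, whose design this summary does not describe. It assumes a hypothetical multiply-accumulate custom instruction with the pattern `add(mul(x, y), z)`; the names `EGraph`, `ematch`, `apply_rule`, and `MAC` are illustrative only. The program computes `c + a * b`, which does not match the pattern syntactically. After a commutativity rewrite adds the equivalent form to the same e-class, the match succeeds.

```python
class EGraph:
    """Toy e-graph: union-find over e-class ids plus a hashcons of e-nodes."""

    def __init__(self):
        self.parent = []   # union-find parent pointers, indexed by e-class id
        self.nodes = {}    # canonical e-node (op, *child_ids) -> e-class id

    def find(self, cid):
        while self.parent[cid] != cid:
            cid = self.parent[cid]
        return cid

    def canon(self, node):
        op, *args = node
        return (op, *(self.find(x) for x in args))

    def add(self, op, *args):
        node = self.canon((op, *args))
        if node not in self.nodes:
            self.parent.append(len(self.parent))
            self.nodes[node] = len(self.parent) - 1
        return self.find(self.nodes[node])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.rebuild()
        return a

    def rebuild(self):
        # Re-canonicalize all e-nodes; nodes that become identical must share a class.
        changed = True
        while changed:
            changed = False
            fresh = {}
            for node, cid in self.nodes.items():
                node, cid = self.canon(node), self.find(cid)
                if node in fresh and self.find(fresh[node]) != cid:
                    self.parent[cid] = self.find(fresh[node])
                    changed = True
                fresh.setdefault(node, cid)
            self.nodes = fresh

    def classes(self):
        by_class = {}
        for node, cid in self.nodes.items():
            by_class.setdefault(self.find(cid), []).append(node)
        return by_class


def ematch(eg, pat, cid, env):
    """Yield variable bindings under which `pat` matches some e-node in class `cid`.
    A pattern is a string variable such as "?x" or a tuple (op, *subpatterns)."""
    cid = eg.find(cid)
    if isinstance(pat, str):
        if pat not in env:
            yield {**env, pat: cid}
        elif eg.find(env[pat]) == cid:
            yield env
        return
    op, *subpats = pat
    for node in eg.classes().get(cid, []):
        if node[0] != op or len(node) - 1 != len(subpats):
            continue
        envs = [env]
        for sub, child in zip(subpats, node[1:]):
            envs = [e2 for e in envs for e2 in ematch(eg, sub, child, e)]
        yield from envs


def instantiate(eg, pat, env):
    if isinstance(pat, str):
        return env[pat]
    op, *subs = pat
    return eg.add(op, *(instantiate(eg, s, env) for s in subs))


def apply_rule(eg, lhs, rhs):
    """One pass of a rewrite rule: every match of lhs is unioned with its rhs form."""
    matches = [(cid, env) for cid in list(eg.classes())
               for env in ematch(eg, lhs, cid, {})]
    for cid, env in matches:
        eg.union(cid, instantiate(eg, rhs, env))


eg = EGraph()
a, b, c = eg.add("a"), eg.add("b"), eg.add("c")
root = eg.add("add", c, eg.add("mul", a, b))        # program: c + a * b

MAC = ("add", ("mul", "?x", "?y"), "?z")            # hypothetical custom instruction

before = list(ematch(eg, MAC, root, {}))            # operand order blocks the match
apply_rule(eg, ("add", "?p", "?q"), ("add", "?q", "?p"))   # commutativity of add
after = list(ematch(eg, MAC, root, {}))             # now matches via a * b + c

print("matches before rewrite:", before)
print("matches after rewrite: ", after)
```

The sketch recomputes the class index on every match and runs a single rewrite pass, which is enough to show the principle. A production matcher would saturate the e-graph with many rules, index e-nodes by operator, and use a cost model to choose among the matched instructions.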
Implications
- Advancing RISC-V Specialization: Aquas targets the performance limitations of current open-source RISC-V ASIP frameworks, which could make highly customized, domain-specific hardware more practical to build and deploy.
- AI/ML Acceleration: With speedups of up to 9.27x on workloads that include LLM inference, Aquas suggests that specialized RISC-V cores can deliver substantially higher throughput for demanding AI applications.
- MLIR Ecosystem Validation: The results support MLIR as a viable unified infrastructure for complex hardware-software co-design flows, particularly in emerging heterogeneous computing environments.
- Democratization of ASIP Design: A high-performance, holistic open-source framework could lower the barrier for researchers and companies to design and optimize specialized RISC-V hardware for niche applications.