Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR

Abstract

Aquas is a novel, MLIR-based holistic hardware-software co-design framework addressing the performance limitations of existing open-source RISC-V ASIP frameworks. It enhances specialization via hardware improvements like a burst DMA engine and advanced HLS optimizations, coupled with an e-graph based retargetable compiler featuring a novel instruction matching engine. This approach enables significant performance gains, demonstrating up to 9.27x speedup on real-world workloads, including LLM inference and point cloud processing.

Report

Key Highlights

  • Holistic Co-Design: Aquas is introduced as a comprehensive framework for Application-Specific Instruction-Set Processors (ASIPs) based on RISC-V, focusing on simultaneous hardware and software optimization.
  • MLIR Foundation: The framework leverages the Multi-Level Intermediate Representation (MLIR) to unify the optimization pipeline across both compiler and synthesis stages.
  • Performance Breakthrough: In the reported evaluation, Aquas achieves a speedup of up to 9.27x on real-world workloads.
  • Target Domains: The speedups were measured on point cloud processing and Large Language Model (LLM) inference.

Technical Details

  • Compiler Infrastructure: The software side is a retargetable compiler built on an e-graph, a data structure that compactly represents many equivalent forms of a program so that instruction selection can choose among them.
  • Novel Matching Engine: A new matching engine is proposed within the e-graph based compiler to enable more efficient and accurate instruction matching for specialized hardware.
  • Hardware Synthesis Enhancements: Hardware specialization is improved through the integration of advanced High-Level Synthesis (HLS) optimizations.
  • Memory Access: To overcome memory bandwidth bottlenecks, the framework incorporates a burst Direct Memory Access (DMA) engine, providing fast memory access capabilities to the synthesized ASIP hardware.
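
To make the e-graph idea concrete, the following is a minimal, generic sketch of e-graph pattern matching for instruction selection. It is not the Aquas matching engine, whose design this summary does not describe. The `EGraph` class, the `FMA` pattern, and the single commutativity rewrite are all hypothetical names chosen for illustration, and the sketch omits the congruence-closure rebuild and cost-based extraction that a production e-graph would need.

```python
class EGraph:
    """Tiny e-graph: each e-class holds equivalent e-nodes (op, child_class_ids...)."""

    def __init__(self):
        self.parent = []   # union-find over e-class ids
        self.nodes = {}    # canonical e-class id -> set of e-nodes
        self.memo = {}     # hashcons: e-node -> e-class id

    def find(self, c):
        while self.parent[c] != c:
            c = self.parent[c]
        return c

    def add(self, op, *children):
        node = (op, *(self.find(c) for c in children))
        if node in self.memo:
            return self.find(self.memo[node])
        cid = len(self.parent)
        self.parent.append(cid)
        self.nodes[cid] = {node}
        self.memo[node] = cid
        return cid

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.nodes[a] |= self.nodes.pop(b)
        return a

    def ematch(self, pattern, cid, env=None):
        """Yield variable bindings under which `pattern` matches e-class `cid`."""
        env = env or {}
        cid = self.find(cid)
        if isinstance(pattern, str):           # pattern variable such as "?x"
            if pattern not in env:
                yield {**env, pattern: cid}
            elif self.find(env[pattern]) == cid:
                yield env
            return
        op, *subpatterns = pattern
        for node_op, *kids in list(self.nodes[cid]):
            if node_op != op or len(kids) != len(subpatterns):
                continue
            envs = [env]
            for sub, kid in zip(subpatterns, kids):
                envs = [e2 for e in envs for e2 in self.ematch(sub, kid, e)]
            yield from envs


g = EGraph()
a, b, c = g.add("a"), g.add("b"), g.add("c")
prod = g.add("mul", a, b)
root = g.add("add", c, prod)                   # source form: c + a*b

# Hypothetical custom instruction: fused multiply-add, written as x*y + z.
FMA = ("add", ("mul", "?x", "?y"), "?z")
before = list(g.ematch(FMA, root))             # operand order differs, so no match yet

# Commutativity rewrite: for each add(x, y) in the root class, also record add(y, x).
for op, *kids in list(g.nodes[g.find(root)]):
    if op == "add":
        g.union(root, g.add("add", kids[1], kids[0]))

matches = list(g.ematch(FMA, root))            # the equivalent form now matches
```

The point of the sketch is that the source expression `c + a*b` does not syntactically match the instruction pattern, but once the commuted form is recorded in the same e-class, the matcher finds it without the compiler having to commit to one operand order in advance.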

Implications

  • Advancing RISC-V Specialization: Aquas directly tackles the performance limitations and rigidity found in current open-source RISC-V ASIP frameworks, accelerating the deployment of highly customized, domain-specific hardware.
  • AI/ML Acceleration: With speedups of up to 9.27x on workloads that include LLM inference, Aquas shows that specialized RISC-V cores can deliver substantially higher throughput for demanding AI applications.
  • MLIR Ecosystem Validation: The results suggest that MLIR can serve as a unified infrastructure for complex hardware-software co-design flows, particularly in emerging heterogeneous computing environments.
  • Democratization of ASIP Design: Providing a high-performance, holistic open-source framework lowers the barrier for researchers and companies to design and optimize specialized RISC-V hardware for niche applications.