NPS: A Framework for Accurate Program Sampling Using Graph Neural Network

Abstract

Neural Program Sampling (NPS) is a framework that uses a Graph Neural Network (GNN) to select representative simulation points accurately, addressing the limited expressiveness and time-consuming manual tuning of the traditional SimPoint/Basic Block Vector (BBV) method. NPS applies its specialized GNN, AssemblyNet, to dynamic program snapshots to learn rich execution embeddings that capture behaviors such as data computation, code path execution, and data flow. In experiments, NPS outperforms SimPoint by up to 63%, reducing the average sampling error by 38% and enabling faster, more agile microprocessor innovation, particularly for platforms such as RISC-V.

Report

Key Highlights

  • Innovation: Introduction of Neural Program Sampling (NPS), a GNN-based framework designed to modernize and accelerate program sampling for microprocessor design.
  • Problem Solved: NPS addresses the limited expressiveness of the decades-old Basic Block Vector (BBV) representation used by SimPoint, whose accuracy tuning can consume months of manual effort (a minimal BBV sketch follows this list).
  • Performance Improvement: In experiments, NPS outperforms the traditional SimPoint approach by up to 63% in accuracy.
  • Efficiency Gain: The framework reduces the average sampling error by 38%, substantially cutting the overhead of expensive accuracy tuning.
  • Model Superiority: NPS demonstrates higher accuracy and generality compared to existing state-of-the-art GNN approaches used in code behavior learning.
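
For context, the BBV baseline is easy to sketch: it only counts how often each basic block executes within a fixed instruction interval. The snippet below is a minimal illustration, assuming the dynamic trace is available as (basic-block id, instruction count) pairs; the function name and interval size are illustrative, not from the paper.

```python
# Minimal sketch of the Basic Block Vector (BBV) representation used by
# SimPoint. Assumes a dynamic trace of (block_id, instr_count) pairs; real
# tools gather this with binary instrumentation.
from collections import Counter

INTERVAL = 100_000_000  # instructions per sampling interval (illustrative)

def bbv_from_trace(trace):
    """Yield one normalized BBV (dict: block_id -> frequency) per interval."""
    counts, executed = Counter(), 0
    for block_id, instr_count in trace:
        counts[block_id] += instr_count
        executed += instr_count
        if executed >= INTERVAL:
            total = sum(counts.values())
            yield {b: c / total for b, c in counts.items()}
            counts, executed = Counter(), 0

# A BBV records only *which* blocks ran and *how often*; it is blind to data
# values, memory addresses, and data flow, which is the expressiveness gap
# NPS targets with learned execution embeddings.
```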

Technical Details

  • Framework Architecture: NPS uses dynamic program snapshots to generate execution embeddings, which then drive representative sample selection (a clustering sketch follows this list).
  • Core GNN Model: AssemblyNet serves as the specialized Graph Neural Network architecture within NPS.
  • Embedding Generation: AssemblyNet is engineered to learn a program's behavior by capturing specific runtime and structural characteristics, including data computation, code path execution, and detailed data flow.
  • Training Task: AssemblyNet is trained with a predictive auxiliary task, specifically a data-prefetch objective that predicts consecutive memory addresses (see the training sketch after this list).
  • Data Representation: NPS moves beyond the simple Basic Block Vector (BBV) representation, utilizing learned, high-quality execution embeddings.
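
The selection step can be sketched under the assumption that AssemblyNet has already produced one embedding per execution interval; the clustering below mirrors SimPoint's phase analysis, but over learned embeddings instead of BBVs. Function and parameter names are hypothetical.

```python
# Hypothetical selection step: cluster per-interval execution embeddings and
# pick one representative simulation point per cluster, weighted by how much
# of the execution that cluster covers.
import numpy as np
from sklearn.cluster import KMeans

def select_simulation_points(embeddings: np.ndarray, k: int):
    """embeddings: (num_intervals, dim). Returns (interval_ids, weights)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(embeddings)
    points, weights = [], []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        # Representative = interval whose embedding is closest to the centroid.
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        points.append(int(members[np.argmin(dists)]))
        # Weight = fraction of intervals in the cluster, used to extrapolate
        # full-program performance from the sampled intervals.
        weights.append(len(members) / len(embeddings))
    return points, weights
```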

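The training setup can be approximated in spirit, though not in architectural detail, by a small message-passing network with a prefetch head. Everything below, including the class name SnapshotGNN, the graph construction, and the quantized address-delta labels, is an illustrative assumption rather than the published AssemblyNet design.

```python
# Highly simplified, hypothetical stand-in for AssemblyNet training: message
# passing over an instruction-level snapshot graph, with a self-supervised
# prefetch head that predicts the (quantized) delta to the next memory address.
import torch
import torch.nn as nn

class SnapshotGNN(nn.Module):
    def __init__(self, num_opcodes=256, dim=128, num_delta_classes=4096):
        super().__init__()
        self.embed = nn.Embedding(num_opcodes, dim)   # per-instruction features
        self.msg = nn.Linear(dim, dim)                # message function
        self.upd = nn.GRUCell(dim, dim)               # node update function
        self.prefetch_head = nn.Linear(dim, num_delta_classes)

    def forward(self, opcodes, adj, rounds=3):
        # opcodes: (N,) instruction opcode ids; adj: (N, N) dataflow adjacency.
        h = self.embed(opcodes)
        for _ in range(rounds):
            m = adj @ self.msg(h)                     # aggregate neighbor messages
            h = self.upd(m, h)
        return self.prefetch_head(h), h               # per-node logits + states

model = SnapshotGNN()
loss_fn = nn.CrossEntropyLoss(ignore_index=-1)        # -1 = non-memory instruction
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One toy training step on a random "snapshot"; real inputs would come from
# dynamic program snapshots captured during execution.
opcodes = torch.randint(0, 256, (64,))
adj = (torch.rand(64, 64) < 0.05).float()
delta_labels = torch.randint(-1, 4096, (64,))         # quantized address deltas
logits, node_states = model(opcodes, adj)
opt.zero_grad()
loss_fn(logits, delta_labels).backward()
opt.step()

# Pooling the node states yields the execution embedding for this interval,
# the input to the clustering sketch above.
snapshot_embedding = node_states.mean(dim=0)
```
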
Implications

  • Accelerated Architectural Innovation: By automating and increasing the accuracy of program sampling, NPS removes a significant bottleneck in the hardware design cycle, enabling much faster iteration and development.
  • Support for RISC-V Extensions: This framework is crucial for the modern processor landscape, supporting the growing demand for rapid architectural innovations, particularly the design and validation of custom RISC-V extensions.
  • Improved Simulation Fidelity: The high-quality execution embeddings generated by NPS lead to more representative simulation points, improving the reliability and fidelity of workload simulation results used for performance evaluation.
  • ML Integration in Hardware Design: NPS represents a successful shift toward integrating advanced Machine Learning and AI techniques (GNNs) into core hardware architecture methodologies, streamlining complex and previously manual optimization tasks.
