ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation

Abstract

ControlPULP is an open-source, HW/SW RISC-V parallel Power Controller System (PCS) designed to meet the real-time, closed-loop power and thermal management requirements of many-core HPC processors. The architecture couples a single-core MCU with a multi-core programmable cluster that accelerates advanced power management policies. This yields a 4.9x speedup in control firmware execution over single-core execution, at an area overhead of about 0.1% of a modern HPC CPU die. The design is validated with an FPGA-based hardware-in-the-loop emulation framework.

Report

Structured Report: ControlPULP Analysis

Key Highlights

  • RISC-V Parallel PCS: ControlPULP is an open-source, hardware/software Power Controller System (PCS) built on RISC-V, designed for high-bandwidth, real-time control in modern many-core HPC chips.
  • Performance Acceleration: The system achieves a 4.9x speedup in the execution of the Power Control Firmware (PCF) compared to single-core execution, enabling more advanced Multi-Input Multi-Output (MIMO) management algorithms.
  • Minimal Overhead: ControlPULP has a small area footprint, occupying approximately 0.1% of the area of a modern HPC CPU die.
  • Validation Framework: The system was assessed using a novel FPGA-based, closed-loop Hardware-In-The-Loop emulation framework, providing high-fidelity testing.
  • Accuracy: The system demonstrated effective Dynamic Voltage and Frequency Scaling (DVFS) tracking, with a mean deviation within 3% of the plant's Thermal Design Power (TDP) when compared against a software-equivalent model-in-the-loop approach.
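
To make the accuracy figure concrete, here is a minimal sketch of one way a "mean deviation as a percentage of TDP" could be computed from two power traces. The summary does not give the paper's exact metric definition, so the formula, the function name `mean_deviation_pct_of_tdp`, and all trace values below are illustrative assumptions.

```python
def mean_deviation_pct_of_tdp(reference_w, measured_w, tdp_w):
    """Mean absolute gap between two power traces, as a percentage of TDP.

    Assumed definition for illustration; the paper may define it differently.
    """
    if len(reference_w) != len(measured_w) or not reference_w:
        raise ValueError("traces must be non-empty and equally long")
    total_gap_w = sum(abs(r - m) for r, m in zip(reference_w, measured_w))
    return 100.0 * total_gap_w / (len(reference_w) * tdp_w)


# Hypothetical traces in watts, not data from the paper.
reference = [100.0, 110.0, 120.0, 115.0]
measured = [102.0, 108.0, 123.0, 114.0]

deviation = mean_deviation_pct_of_tdp(reference, measured, tdp_w=200.0)
print(f"mean deviation: {deviation:.2f}% of TDP")
```

Normalizing by TDP rather than by the instantaneous reference keeps the metric stable when the plant is running at low power, where a relative error would be inflated.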

Technical Details

  • Architecture: ControlPULP is a parallel PCS platform consisting of two main components: a single-core MCU optimized for fast interrupt handling, and a scalable multi-core programmable cluster accelerator.
  • Acceleration Mechanisms: The platform includes a specialized DMA engine designed for the parallel acceleration of real-time power management policies within the control hyper-period.
  • Software Stack: The controller runs FreeRTOS, a real-time operating system, which schedules the reactive PCF application layer.
  • Use Case: The research demonstrates the platform's capabilities by targeting the power management requirements of a next-generation 72-core HPC processor.
  • Emulation Technique: Assessment relied on an FPGA-based closed-loop emulation framework built on heterogeneous SoC FPGAs, with results compared against a software-equivalent model-in-the-loop approach.
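
The closed-loop idea behind the points above (a controller reading plant state and adjusting per-core frequency every control period) can be sketched in a few lines. This is a toy model under stated assumptions, not the paper's PCF or its FPGA plant: the bang-bang throttling rule, the cubic power model, the first-order thermal lag, and every constant are hypothetical. Only the 72-core count comes from the summary.

```python
from dataclasses import dataclass

# All constants below are hypothetical values chosen for illustration.
F_MIN_GHZ = 0.8
F_STEP_GHZ = 0.1
TEMP_LIMIT_C = 85.0


@dataclass
class Core:
    temp_c: float
    freq_ghz: float


def control_step(core, target_freq_ghz):
    """Controller side: throttle when hot, otherwise ramp toward the target."""
    if core.temp_c > TEMP_LIMIT_C:
        core.freq_ghz = max(F_MIN_GHZ, core.freq_ghz - F_STEP_GHZ)
    else:
        core.freq_ghz = min(target_freq_ghz, core.freq_ghz + F_STEP_GHZ)


def plant_step(core, ambient_c=40.0, r_th_c_per_w=2.0, alpha=0.2):
    """Plant side: toy cubic power model feeding a first-order thermal lag."""
    power_w = core.freq_ghz ** 3
    steady_temp_c = ambient_c + r_th_c_per_w * power_w
    core.temp_c += alpha * (steady_temp_c - core.temp_c)
    return power_w


def run_closed_loop(num_cores=72, periods=200, target_freq_ghz=3.0):
    """Alternate controller and plant updates for every core, every period."""
    cores = [Core(temp_c=40.0, freq_ghz=F_MIN_GHZ) for _ in range(num_cores)]
    peak_temp_c = 40.0
    for _ in range(periods):
        # Each core's update is independent of the others in this toy model,
        # which is the kind of work a parallel cluster could split up.
        for core in cores:
            control_step(core, target_freq_ghz)
            plant_step(core)
            peak_temp_c = max(peak_temp_c, core.temp_c)
    return cores, peak_temp_c


cores, peak_temp_c = run_closed_loop()
print(f"peak temperature: {peak_temp_c:.1f} C")
```

In a hardware-in-the-loop setup the `plant_step` role is played by the emulated power and thermal model, while `control_step` corresponds to firmware running on the controller itself. A model-in-the-loop setup runs both sides in software, as this sketch does.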

Implications

  • Advancing RISC-V in HPC Control: ControlPULP establishes a strong, open-source reference architecture for deploying complex on-chip control functions using RISC-V. This expands RISC-V's role from primary computation cores into essential cyber-physical management infrastructure.
  • Scalability for Many-Core Design: By moving beyond simple MCU-class cores to a scalable parallel cluster, ControlPULP provides the compute throughput needed to run MIMO power and thermal control algorithms on chips with tens or hundreds of cores.
  • Enabling Optimal Power Management: The 4.9x speedup lets system designers fit more computationally intensive control policies, such as predictive or optimal control, within the real-time control period, which could improve energy efficiency and thermal stability under heavy workloads.
  • Standardized Validation: The FPGA-based emulation framework gives hardware architects a repeatable, closed-loop method for verifying the control system against power and thermal plant models before committing to silicon.
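
As a rough illustration of the speedup argument above, the check below asks whether a policy's execution time fits in a control period with and without the reported 4.9x speedup. The execution time and period are hypothetical numbers, not measurements from the paper, and the sketch assumes the speedup applies uniformly to the whole policy.

```python
SPEEDUP = 4.9  # speedup figure reported in the summary


def fits_in_period(single_core_time_us, speedup, period_us):
    """Return (fits, accelerated_time_us) for a policy run once per period."""
    accelerated_time_us = single_core_time_us / speedup
    return accelerated_time_us <= period_us, accelerated_time_us


# Hypothetical policy: 1200 us on one core, 500 us control period.
fits_single, _ = fits_in_period(1200.0, 1.0, 500.0)
fits_parallel, parallel_time_us = fits_in_period(1200.0, SPEEDUP, 500.0)

print(f"single-core fits: {fits_single}")
print(f"parallel fits: {fits_parallel} ({parallel_time_us:.0f} us)")
```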