Envisioning a Safety Island to Enable HPC Devices in Safety-Critical Domains

Abstract

High Performance Computing (HPC) devices are increasingly required for autonomous systems, yet they often lack adequate, standardized support for safety-critical operations. This paper proposes a novel 'Safety Island' concept designed to complement HPC devices and deliver the necessary safety features. Crucially, this island is envisioned using open-source components based on the RISC-V ISA to ease adoption and offer a broad set of enabling features for safety applications.

Report

Key Highlights

  • HPC devices (multicores, accelerators like GPUs) are becoming the only viable option for delivering the required performance in complex safety-critical autonomous systems (e.g., autonomous cars, unmanned planes).
  • The support for realizing safety-critical systems directly on commercial HPC devices is currently heterogeneous and insufficient.
  • The paper introduces a specific concept for a "Safety Island" intended to be coupled with the main HPC device to meet stringent safety requirements.
  • The primary design goals for the proposed Safety Island are to offer a wide, comprehensive feature set and to utilize open-source components.
  • The authors explicitly base their Safety Island realization on the open-source RISC-V Instruction Set Architecture (ISA).

Technical Details

  • Architecture: A secondary, dedicated processing unit (the Safety Island) designed to operate alongside and complement the primary HPC device.
  • Goal-Oriented Design: The island is designed to support a broad set of safety applications on the HPC device it is coupled with.
  • Core Technology: The realization is built on open-source components around the RISC-V ISA.
  • HPC Integration: Designed to address and mitigate the limitations of large, powerful multicores and accelerators like GPUs in safety-critical contexts.
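
The summary does not say which safety features the island provides, so the following is only a minimal sketch of the coupling idea, under stated assumptions. It models a small supervisor that would run on the Safety Island and check two things about the HPC device: that a heartbeat arrives within a deadline, and that two redundant results agree. The names `SafetyIslandMonitor`, `Verdict`, `heartbeat`, and `check`, along with the choice of a watchdog plus result comparison, are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    TIMEOUT = "timeout"      # heartbeat from the HPC device is overdue
    MISMATCH = "mismatch"    # redundant results from the HPC device disagree


@dataclass
class SafetyIslandMonitor:
    """Hypothetical supervisor that runs on the island, not on the HPC device."""
    deadline_ms: int                      # assumed maximum gap between heartbeats
    last_heartbeat_ms: int = 0
    log: list = field(default_factory=list)

    def heartbeat(self, now_ms: int) -> None:
        # Called whenever the HPC device signals that it is alive.
        self.last_heartbeat_ms = now_ms

    def check(self, now_ms: int, result_a: int, result_b: int) -> Verdict:
        # The timeout is checked first: stale results are not worth comparing.
        if now_ms - self.last_heartbeat_ms > self.deadline_ms:
            verdict = Verdict.TIMEOUT
        elif result_a != result_b:
            verdict = Verdict.MISMATCH
        else:
            verdict = Verdict.OK
        self.log.append((now_ms, verdict))
        return verdict


monitor = SafetyIslandMonitor(deadline_ms=10)
monitor.heartbeat(now_ms=100)
print(monitor.check(now_ms=105, result_a=7, result_b=7))
```

In this sketch the monitor holds no state about the workload itself. It sees only timestamps and results, which reflects the idea of keeping the safety logic on a small, separate unit rather than on the complex HPC device.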

Implications

  • Bridging Performance and Safety: The concept targets the conflict between the high computational performance that autonomy workloads (e.g., AI and machine learning) demand and the functional safety that critical domains require.
  • RISC-V in Safety: Building on the open-source RISC-V ISA signals that the authors consider it a suitable base for safety-critical designs. Functional safety standards (such as ISO 26262 for automotive) call for detailed evidence about how a design behaves, and an open, inspectable hardware base can make that evidence easier to produce.
  • Promoting Standardization: A common, open-source Safety Island could reduce the heterogeneous safety support found across commercial HPC devices and encourage a more uniform approach to system safety.
  • Ecosystem Growth: Using open components encourages community collaboration, potentially leading to rapid development of certified safety libraries, tools, and IP cores compatible with the Safety Island architecture.