Envisioning a Safety Island to Enable HPC Devices in Safety-Critical Domains
Abstract
High Performance Computing (HPC) devices are increasingly required for autonomous systems, yet they often lack adequate, standardized support for safety-critical operations. This paper proposes a novel 'Safety Island' concept designed to complement HPC devices and deliver the necessary safety features. Crucially, this island is envisioned using open-source components based on the RISC-V ISA to ease adoption and offer a broad set of enabling features for safety applications.
Report
Key Highlights
- HPC devices (multicores, accelerators like GPUs) are becoming the only viable option for delivering the required performance in complex safety-critical autonomous systems (e.g., autonomous cars, unmanned planes).
- The support for realizing safety-critical systems directly on commercial HPC devices is currently heterogeneous and insufficient.
- The paper introduces a specific concept for a "Safety Island" intended to be coupled with the main HPC device to meet stringent safety requirements.
- The primary design goals for the proposed Safety Island are to offer a wide, comprehensive feature set and to utilize open-source components.
- The authors explicitly base their Safety Island realization on the open-source RISC-V Instruction Set Architecture (ISA).
Technical Details
- Architecture: A secondary, dedicated processing unit (the Safety Island) designed to operate alongside and complement the primary HPC device.
- Goal-Oriented Design: The island is intended to offer a broad set of enabling features so that a wide range of safety applications can be built on top of the coupled HPC device.
- Core Technology: The envisioned realization uses open-source components built around the RISC-V ISA.
- HPC Integration: Designed to address and mitigate the limitations of large, powerful multicores and accelerators like GPUs in safety-critical contexts.
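This summary does not list the island's concrete features, so the following is only an illustrative sketch of one monitoring pattern that a small companion processor commonly provides: a heartbeat watchdog over the HPC device. The class `HeartbeatMonitor`, its fields, and the 100 ms timeout are all hypothetical names and values chosen for the example; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical watchdog a safety island might run for a coupled HPC device."""
    timeout_ms: int            # assumed maximum allowed gap between heartbeats
    last_beat_ms: int = 0      # timestamp of the most recent accepted heartbeat
    last_counter: int = -1     # sequence number of the most recent heartbeat
    safe_state: bool = False   # latched once any violation is detected

    def on_heartbeat(self, now_ms: int, counter: int) -> None:
        """Record a heartbeat; a non-increasing counter is treated as a fault."""
        if counter <= self.last_counter:
            self.safe_state = True
            return
        self.last_counter = counter
        self.last_beat_ms = now_ms

    def poll(self, now_ms: int) -> bool:
        """Return True if the system is in, or must now enter, the safe state."""
        if now_ms - self.last_beat_ms > self.timeout_ms:
            self.safe_state = True
        return self.safe_state


monitor = HeartbeatMonitor(timeout_ms=100)
monitor.on_heartbeat(now_ms=10, counter=0)
on_time = monitor.poll(now_ms=90)     # 80 ms since last beat: within the deadline
monitor.on_heartbeat(now_ms=95, counter=1)
missed = monitor.poll(now_ms=250)     # 155 ms since last beat: deadline missed
```

The fault flag is latched on purpose: once a deadline is missed or a stale counter is seen, later heartbeats do not clear it, which reflects the usual preference in safety designs for an explicit recovery step over silent resumption.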
Implications
- Bridging Performance and Safety: The concept targets the tension between the high computational performance that autonomy workloads (such as AI and machine learning) demand and the functional safety guarantees that critical domains require.
- RISC-V in Safety Contexts: Choosing the open RISC-V ISA signals the authors' confidence in it as a basis for safety-critical designs, although the choice alone does not demonstrate suitability. Functional safety standards (such as ISO 26262 for automotive) place heavy weight on transparency and evidence, so an open ISA with open-source implementations can make designs easier to inspect and assess.
- Promoting Standardization: A common, open-source Safety Island could help reduce the heterogeneous safety support currently found across commercial HPC devices, giving system builders a more uniform and inspectable starting point.
- Ecosystem Growth: Using open components encourages community collaboration, potentially leading to rapid development of certified safety libraries, tools, and IP cores compatible with the Safety Island architecture.