Abstract
In safety-critical applications on SoC architectures, safety monitoring is seeing a revival and gaining importance. ARM Reference Design 1 (RD1) implements a three-level approach to runtime monitoring of functional software, aiming at cost-efficient separation between application and monitoring. This article analyzes the technical foundations of this approach, evaluates its strengths and limitations, and outlines its implications for software and system development – particularly with regard to redundancy, diagnostic coverage, and common-mode risks.
Introduction
Integrating safety-relevant functions into modern SoCs presents significant challenges for developers. This is especially true in mixed-criticality systems, where safety-critical and non-safety-critical software run side by side. Structured monitoring becomes essential in such environments. ARM Reference Design 1 (RD1) addresses this need with a standardized concept that combines three different monitoring layers: application, software monitor, and hardware monitoring. While the concept promises scalability and modularity, it also reaches systemic limits – particularly at higher ASIL requirements.
Architectural Overview
RD1 assigns functional software, monitoring instances, and hardware diagnostics to clearly separated IP blocks. The application itself runs in the so-called High-Performance Island (HP), typically on a Cortex-A-based core under a high-performance operating system such as Linux. The corresponding monitoring instance – referred to as the Function Monitor – is implemented in the Safety Island (SI), usually on a Cortex-R or Cortex-M core running an RTOS such as Zephyr. Hardware-side fault detection is provided by RAS mechanisms (Reliability, Availability, Serviceability), which capture random faults and relay them via interrupts to the software monitor.

Software Architecture and Runtime Monitoring
The application in the HP Island is generally developed as Quality Managed (QM) software. It is not subject to the stringent requirements of functional safety as per ISO 26262, as it is not classified with an ASIL rating. While some projects apply systematic error-avoidance measures, a safety-oriented architecture is not part of this software path.
Monitoring is handled by a separate software instance in the Safety Island. In RD1’s baseline concept, this is limited to alive monitoring – a heartbeat mechanism checks whether the application sends signals at regular intervals. Semantic or content-based validation – such as input value checks or actuator state monitoring – is not performed, as the application is not generically interpretable from the monitor’s perspective. This significantly reduces the functional scope of the monitoring.
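The alive monitoring described above can be illustrated with a minimal sketch. It is not taken from RD1; the class name `HeartbeatMonitor`, the sequence counter, and the `timeout_s` deadline are assumptions chosen for illustration, and timestamps are passed in explicitly so the logic stays deterministic.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Illustrative alive monitor (hypothetical, not the RD1 implementation)."""

    timeout_s: float          # assumed maximum gap between two valid heartbeats
    last_beat_s: float = 0.0  # timestamp of the last accepted heartbeat
    last_counter: int = -1    # last sequence counter received

    def on_heartbeat(self, counter: int, now_s: float) -> None:
        # Accept a heartbeat only if its counter changed; a repeated
        # counter value is not treated as proof that the application runs.
        if counter != self.last_counter:
            self.last_counter = counter
            self.last_beat_s = now_s

    def is_alive(self, now_s: float) -> bool:
        # The application counts as alive while the last accepted
        # heartbeat is no older than the configured timeout.
        return (now_s - self.last_beat_s) <= self.timeout_s


monitor = HeartbeatMonitor(timeout_s=0.1)
monitor.on_heartbeat(counter=0, now_s=0.00)
monitor.on_heartbeat(counter=1, now_s=0.08)
print(monitor.is_alive(now_s=0.15))
```

Note what the sketch does not do: it never looks at the payload of the application, which is exactly the limitation of pure alive monitoring discussed above.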
Hardware Diagnostics and Fault Classification
RAS mechanisms complement software monitoring by providing hardware-side detection of random faults, such as ECC errors in RAM or caches. Detection occurs in real time and is typically forwarded to the monitoring instance via interrupt. RAS is especially effective in detecting transient faults that occur briefly and resolve themselves, caused for example by electromagnetic interference. Latent faults can also be detected under certain conditions, provided that sufficient diagnostic coverage is implemented, which requires considerable engineering effort.
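How a monitoring instance might react to forwarded RAS events can be sketched as follows. This is a simplified model under stated assumptions, not an Arm API: the event fields, the three reactions, and the policy of escalating after `ce_threshold` corrected errors from the same source (as a hint that a fault may not be transient) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FaultAction(Enum):
    LOG = "log"                # corrected error, assumed transient
    ESCALATE = "escalate"      # repeated corrected errors at one source
    SAFE_STATE = "safe_state"  # uncorrected error


@dataclass(frozen=True)
class RasEvent:
    source: str      # hypothetical identifier, e.g. "l2_cache" or "dram"
    corrected: bool  # True if ECC corrected the error in hardware


class RasFaultHandler:
    """Illustrative classification of RAS events (assumed policy)."""

    def __init__(self, ce_threshold: int) -> None:
        self.ce_threshold = ce_threshold
        self.ce_counts: dict[str, int] = {}

    def handle(self, event: RasEvent) -> FaultAction:
        # Uncorrected errors cannot be masked and demand a fault reaction.
        if not event.corrected:
            return FaultAction.SAFE_STATE
        # Count corrected errors per source; many hits at the same source
        # suggest the fault may be permanent rather than transient.
        count = self.ce_counts.get(event.source, 0) + 1
        self.ce_counts[event.source] = count
        if count >= self.ce_threshold:
            return FaultAction.ESCALATE
        return FaultAction.LOG


handler = RasFaultHandler(ce_threshold=3)
print(handler.handle(RasEvent(source="l2_cache", corrected=True)))
```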
Systemic Risks and Architectural Limitations
Despite the separation between HP and Safety Islands, architectural couplings remain – such as shared memory regions, buses, or interrupt sources. These shared resources are potential sources of common-cause faults and possible single points of failure – particularly in cases of voltage drops, clock anomalies, or interconnect failures. While separation at the operating system level is typically ensured, physical isolation is often incomplete, which must be critically evaluated in safety-oriented applications. Arm's Memory System Resource Partitioning and Monitoring (MPAM) extension, introduced with Armv8.4-A, is a promising step toward better partitioning of shared memory-system resources.
Development Implications and Redundancy Requirements
From a development perspective, the RD1 concept introduces increased complexity. Two software paths must be independently developed, tested, and validated: the application itself and the function monitor. Since the latter cannot perform complete functional monitoring, additional system-level redundancy is required – for instance, through dual-redundant sensors (2oo2) or diverse actuator paths. Without such measures, the system lacks the ability to detect and compensate for functional failures.
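The 2oo2 sensor redundancy mentioned above can be illustrated with a short sketch. The function name `vote_2oo2`, the tolerance-based comparison, and the use of the mean as the accepted value are assumptions for illustration; a real system would derive the tolerance from the sensor specification and define its own fault reaction.

```python
from typing import Optional


def vote_2oo2(channel_a: float, channel_b: float, tolerance: float) -> Optional[float]:
    """Illustrative 2oo2 comparison of two redundant sensor channels.

    Returns the mean of both readings if they agree within `tolerance`,
    otherwise None to signal that no valid value is available.
    """
    if abs(channel_a - channel_b) <= tolerance:
        return (channel_a + channel_b) / 2.0
    # Disagreement: the caller has to trigger its fault reaction.
    return None


reading = vote_2oo2(10.0, 10.5, tolerance=1.0)
if reading is None:
    print("sensor channels disagree, entering fault reaction")
else:
    print(f"accepted reading: {reading}")
```

A 2oo2 comparison detects a discrepancy but cannot tell which channel is wrong, so the typical reaction is a transition to a safe state rather than continued operation.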
Furthermore, RD1 does not offer true functional redundancy in the sense of lockstep-based dual-core systems or 2oo2 architectures, although the architecture would technically allow it. Monitoring remains limited to structural and temporal aspects, which restricts its applicability to ASIL-B and selected ASIL-C functions.
As a reminder: to determine an Automotive Safety Integrity Level (ASIL), developers and engineers evaluate three key factors:
- Controllability: This assesses the likelihood that a typical driver or operator could recognize the hazard in time and take appropriate action to avoid injury.
- Severity: This refers to the potential seriousness of harm or injury that could result from a hazardous event.
- Exposure: This considers how often the operational or environmental conditions occur that could lead to such a hazardous event, that is, how frequently the system is in a situation where the hazard can arise.
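How the three factors combine into an ASIL can be sketched as a small lookup. The sketch assumes the usual class ranges of ISO 26262 (S0 to S3, E0 to E4, C0 to C3) and uses the widely known sum shorthand for the determination table; the table in the standard remains the authoritative source, and the function name `determine_asil` is hypothetical.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Illustrative ASIL lookup from S, E and C class numbers.

    Assumes S0-S3, E0-E4 and C0-C3 and the sum shorthand for the
    determination table; consult the standard for the normative table.
    """
    if not (0 <= severity <= 3 and 0 <= exposure <= 4 and 0 <= controllability <= 3):
        raise ValueError("expected S in 0..3, E in 0..4, C in 0..3")
    # Class 0 in any factor means no ASIL is assigned.
    if 0 in (severity, exposure, controllability):
        return "QM"
    # Shorthand: a sum of 7, 8, 9, 10 maps to A, B, C, D; lower sums are QM.
    by_sum = {7: "ASIL-A", 8: "ASIL-B", 9: "ASIL-C", 10: "ASIL-D"}
    return by_sum.get(severity + exposure + controllability, "QM")


print(determine_asil(severity=3, exposure=4, controllability=3))
```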
Conclusion
ARM RD1 provides a well-considered, modular safety monitoring concept for heterogeneous SoCs. Its three-level approach combines software and hardware mechanisms for runtime monitoring but is primarily suited for applications with medium safety requirements. The lack of functional redundancy and the risk of architecture-related common-mode failures significantly limit its suitability for higher ASILs. For developers, this means that additional safety measures outside the RD1 core are necessary – especially concerning sensor/actuator redundancy, communication paths, and fault response strategies.