ISO 26262-Compliant Safety Analysis Methods

Yuzhu Yang 杨玉柱 & Prof. Dr. Mirko Conrad

The development of safety-related Electrical/Electronic (E/E) systems in the automotive industry is closely tied to functional safety, and a critical aspect of developing functionally safe systems is performing safety analyses in accordance with the ISO 26262 standard, which provides recommendations on safety analysis methodology. Which analytical methods are employed in the analysis of automotive safety-related systems? How are these methods classified and applied? This article addresses these questions with a view to supporting the development of ISO 26262-compliant products: it introduces the analytical methods used in the development of safety-related systems in the automotive industry, along with the best practices associated with these methods.

The Purpose of Safety Analyses

Start with the purpose of automotive safety analyses: why is it necessary to conduct safety analyses when developing safety-related automotive E/E systems?

ISO 26262 defines functional safety as: "Absence of unreasonable risk due to hazards caused by malfunctioning behavior of E/E systems" (ISO 26262:2018). Typically, malfunctions in E/E systems are caused by two types of failures:

  • Systematic Failure: A failure that is related in a deterministic way to a certain cause and can only be eliminated by changing the design, the manufacturing process, operating procedures, documentation, or other associated factors.
  • Random Hardware Failure: A failure that can occur unpredictably during the lifetime of a hardware element and that follows a probability distribution.

Therefore, the aim of safety analyses is to ensure that the risk of safety goal violations due to systematic or random failures is sufficiently low.

It is important to note that ISO 26262 does not quantify the occurrence rate of systematic failures. Nevertheless, measures that prevent systematic failures help to reduce the overall risk of violating safety goals or safety requirements.

Scope of Safety Analyses

The scope of safety analyses includes:

  • Validation of safety goals and safety concepts.
  • Verification of safety concepts and safety requirements.
  • Identification of conditions and causes (incl. faults / failures) that could lead to the violation of a safety goal or safety requirement.
  • Identification of additional safety requirements for detection of faults / failures.
  • Determination of the required responses to detected faults / failures.
  • Identification of additional measures to verify that the safety goals or safety requirements are complied with.

Implementation of Safety Analyses

Depending on the application, safety analyses can be used to:

  • Identify new hazards not previously identified during the HARA
  • Identify faults, or failures, that can lead to the violation of a safety goal / safety requirement
  • Identify potential causes of such failures
  • Support the definition of safety measures for fault prevention / fault control
  • Provide evidence for the suitability of safety concepts
  • Support the verification of safety concepts, safety requirements
  • Support the identification of design and test requirements

In other words: for the item under analysis, safety goals are derived from the HARA on the basis of the safety concept, and the safety requirements are in turn derived from the safety goals. Additional safety requirements are then specified for the detection of the identified faults or failures, and the required responses to detected faults or failures are determined. Finally, additional measures are identified to verify whether the implemented safety measures satisfy the corresponding safety requirements and/or safety goals.

Introduction to Safety Analysis Methods

Qualitative and Quantitative Methods

1. Qualitative Safety Analysis Methods

The methods of qualitative safety analyses primarily include:

  • Qualitative FMEA at system, design, or process level
  • Qualitative FTA
  • Hazard and Operability Analysis (HAZOP)
  • Qualitative ETA

Qualitative analysis methods are particularly suitable for software safety analyses, for which no more appropriate (quantitative) methods are available.

2. Quantitative Safety Analysis Methods

Quantitative safety analysis methods complement qualitative safety analysis methods and are primarily used to evaluate the hardware architectural metrics and the risk of safety goal violations due to random hardware failures against defined target values, thereby validating the hardware design (please refer to ISO 26262:2018, 5 and 8). Quantitative safety analyses additionally require knowledge of the quantitative failure rates of the hardware elements.

The methods of quantitative safety analyses primarily include:

  • Quantitative FMEA
  • Quantitative FTA
  • Quantitative ETA
  • Markov models
  • Reliability block diagrams (RBD)

3. Differences and Connections Between Quantitative and Qualitative Analyses

3.1 Differences Between Quantitative and Qualitative Analyses

The distinction between these two methods is as follows:

Quantitative analysis is concerned with predicting failure rates, whereas qualitative analysis focuses on identifying failures without predicting their rates. Qualitative safety analysis methods are generic in nature and can be applied at the system, hardware, and software levels. Quantitative safety analyses additionally require knowledge of the quantitative failure rates of the relevant hardware elements; in the context of ISO 26262, they are employed to evaluate the hardware architectural metrics and the risk of safety goal violations due to random hardware failures, and thereby to validate the hardware design.

3.2 Connections Between Quantitative and Qualitative Analyses

Both rely on an understanding of the relevant fault types and failure modes, and quantitative safety analyses complement qualitative analyses. In engineering applications, the two approaches should be used in conjunction.

Inductive and Deductive Analyses

Apart from the qualitative/quantitative distinction, safety analysis methods can also be classified by how they are carried out: inductive and deductive analyses.

1. Introduction to Inductive and Deductive Analyses

Inductive analyses are also known as bottom-up methods: starting from causes at the lowest level, they work upward to infer the possible effects and thus identify potential failures. In contrast, deductive analyses are top-down methods that start from a failure and trace back to its possible causes.

Common Safety Analysis Methods

Several methods are used in engineering applications. For example, FMEA (Failure Mode and Effects Analysis) and FTA (Fault Tree Analysis) are two standard methods for analyzing faults and failures of items and elements within the ISO 26262 framework. FMEDA is typically applied when the system is being designed to meet the relevant ASIL requirements. ETA or RBD (Reliability Block Diagrams) can also be used to perform safety-related analyses.

Figure 1. FMEA Handbook

Failure Mode and Effects Analysis (FMEA)

Failure Mode and Effects Analysis (FMEA) is one of the earliest failure analysis techniques; it was developed by reliability engineers in the 1940s to study potential malfunctions of military systems. It was adopted by the automotive industry in the 1970s and later incorporated into international standards. The most widely used procedure today is documented in the FMEA handbook published jointly by AIAG (Automotive Industry Action Group) and VDA (German Association of the Automotive Industry) in 2019 (shown in Figure 1), which also assists suppliers in their development work.

The handbook was developed by subject matter experts (SMEs) from OEMs and Tier 1 suppliers. It integrates the best practices of AIAG and VDA into a structured methodology, which includes design FMEA, process FMEA, and supplemental FMEA considerations for prevention and detection measures. FMEA focuses on technical risks and is an analytical method for preventive quality management and monitoring in product design and production processes.

Figure 2. FMEA Diagram, Bottom-up Approach - ISO 26262-10:2018(E)

FMEA begins by analyzing the causes of malfunction of each structural element of the system, studying element malfunctions inductively in order to derive optimization measures for potentially unacceptable failures. As a typical application in the automotive industry, FMEA can be performed qualitatively or quantitatively to analyze faults and malfunctions in safety-related system designs. Usually performed as an inductive (bottom-up) approach (see Figure 2), FMEA focuses on how failures manifest themselves in system elements and how these failures affect the system.

Figure 3. FTA Diagram, Top-down Approach - ISO 26262-10:2018(E)

Failure Modes, Effects and Diagnostic Coverage Analysis (FMEDA)

The Failure Modes, Effects and Diagnostic Coverage Analysis (FMEDA) methodology was first developed by exida in the 1990s, and in 2011 the functional safety standard ISO 26262 adopted it as a recommended method. FMEDA can be regarded as a quantitative extension of FMEA: it considers quantitative failure rates of the hardware elements and the distribution of failure modes over these elements, and it identifies critical failure modes by taking into account the safety mechanisms addressing the corresponding failure modes and their diagnostic coverage.

The FMEDA method is mainly used in the hardware architectural design and hardware detailed design phases, where the hardware architectural metrics, such as the single-point fault metric (SPFM) and the latent fault metric (LFM), are calculated at the hardware design level. Applying FMEDA iteratively can improve the hardware design.
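As a rough illustration of the FMEDA bookkeeping, the following Python sketch computes SPFM and LFM from a small, hypothetical table of failure modes; the names, FIT rates, and diagnostic coverage values are invented for illustration, and the split of each failure rate is deliberately simplified.

```python
# Minimal FMEDA-style bookkeeping sketch (illustrative only).
# Failure mode names, FIT rates, and coverage values are hypothetical.

from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    fit: float          # failure rate in FIT (failures per 10^9 h)
    violates_sg: bool   # failure mode can directly violate the safety goal
    dc_residual: float  # diagnostic coverage w.r.t. residual faults (0..1)
    dc_latent: float    # diagnostic coverage w.r.t. latent faults (0..1)

def spfm_lfm(modes):
    """Compute SPFM and LFM from a simplified split of each failure rate."""
    total = sum(m.fit for m in modes)
    # Single-point/residual share: uncovered part of modes that can violate the safety goal.
    spf_rf = sum(m.fit * (1.0 - m.dc_residual) for m in modes if m.violates_sg)
    # Latent multiple-point share: remaining faults not covered w.r.t. latent faults.
    latent = 0.0
    for m in modes:
        remaining = m.fit - (m.fit * (1.0 - m.dc_residual) if m.violates_sg else 0.0)
        latent += remaining * (1.0 - m.dc_latent)
    spfm = 1.0 - spf_rf / total
    lfm = 1.0 - latent / (total - spf_rf)
    return spfm, lfm

modes = [
    FailureMode("RAM bit flip",       50.0, True,  0.99, 0.90),
    FailureMode("ADC drift",          10.0, True,  0.90, 0.60),
    FailureMode("Watchdog osc. loss",  5.0, False, 0.00, 0.90),
]
spfm, lfm = spfm_lfm(modes)
print(f"SPFM = {spfm:.2%}, LFM = {lfm:.2%}")
```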

Fault Tree Analysis (FTA)

Fault Tree Analysis (FTA), developed in the early 1960s by Bell Labs, was used to evaluate a ballistic missile launch system. This analysis method was then standardized by IEC in 2006 and has been referenced by standards such as ISO 26262 as a potential or recommended analysis method.

FTA can be applied as a qualitative or a quantitative analysis technique. For example, one can start with a qualitative fault analysis and then add quantitative failure statistics to strengthen the analysis and obtain the quantitative variant.

In contrast to FMEA, FTA is a deductive (top-down) analysis method (see Figure 3) that helps identify the base events, or combinations of base events, that may lead to the defined top event. Typically, the top event is an undesired system event that violates a safety goal or a safety requirement derived from a safety goal.

To establish an FTA, we start with the undesired top event and progressively build a graphical tree structure. The interaction of the potential causes of the undesired event is represented by Boolean logic gates, such as AND, OR, and NOT gates. The quantitative variant of FTA can be used to calculate the PMHF metric, which is also a method recommended by ISO 26262.
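As a sketch of the quantitative variant, the small fault tree below combines base event probabilities through AND/OR gates under the usual assumption of independent base events; the event names and probabilities are invented for illustration only.

```python
# Tiny fault tree evaluation sketch (independent base events assumed).
# Event names and probabilities are illustrative, not from a real analysis.

def or_gate(*probs):
    # P(A or B) = 1 - product(1 - p_i) for independent events
    out = 1.0
    for p in probs:
        out *= (1.0 - p)
    return 1.0 - out

def and_gate(*probs):
    # P(A and B) = product(p_i) for independent events
    out = 1.0
    for p in probs:
        out *= p
    return out

# Hypothetical base event probabilities (per operating hour)
sensor_fault  = 1e-6
mcu_fault     = 5e-7
monitor_fault = 1e-7

# Top event: unintended actuation = (sensor OR MCU fault) AND monitoring fails
top_event = and_gate(or_gate(sensor_fault, mcu_fault), monitor_fault)
print(f"P(top event) ~ {top_event:.2e} per hour")
```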

Figure 4. Classification and Integration of Analytical Methods

Comprehensive Application of Safety Analysis Methods

In practical engineering applications, the two classification schemes can be combined to form the classification shown in Figure 4. In the development of safety-critical E/E systems, combining top-down methods (such as FTA) and bottom-up methods (such as FMEA) makes it possible to identify the failure modes of semiconductor elements and to apply the results at the element level. Starting from a lower level of abstraction, a quantitatively precise assessment of the failure distribution of semiconductor elements can be performed, with the failure distribution otherwise based on qualitative assumptions.

Figure 5. FTA and FMEA Combined Analyses Diagram - ISO 26262-10:2018(E)

E/E systems are composed of numerous elements and sub-components. FTA and FMEA can be combined into a complementary safety analysis that balances top-down and bottom-up methods. Figure 5 shows one possible combination of FTA and FMEA: the base events originate from different FMEAs (marked FMEA A-E in this example) and are derived from analyses conducted at a lower level of abstraction (such as the sub-component, component, or module level). In this case, base events 1 and 2 derive from failures identified in FMEA D, while failures from FMEA B are not used in the FTA.
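The sketch below uses hypothetical FMEA entries and failure rates to map failure modes identified in lower-level FMEAs onto the base events of a system-level FTA, mirroring the example in which base events 1 and 2 come from FMEA D; all names and values are invented.

```python
# Illustrative linkage of FMEA results to FTA base events (hypothetical content).

# Failure modes identified per FMEA, with invented failure rates in FIT.
fmea_results = {
    "FMEA A": {"connector open": 2.0},
    "FMEA B": {"housing crack": 1.0},   # not used in the FTA in this example
    "FMEA C": {"driver stage short": 3.0},
    "FMEA D": {"sensor open circuit": 4.0, "sensor short to ground": 6.0},
    "FMEA E": {"supply undervoltage": 5.0},
}

# FTA base events referencing FMEA failure modes (base events 1 and 2 from FMEA D).
fta_base_events = {
    "base event 1": ("FMEA D", "sensor open circuit"),
    "base event 2": ("FMEA D", "sensor short to ground"),
    "base event 3": ("FMEA C", "driver stage short"),
}

for event, (fmea, mode) in fta_base_events.items():
    fit = fmea_results[fmea][mode]
    print(f"{event}: '{mode}' from {fmea}, lambda = {fit} FIT")
```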

Figure 6. Safety Analyses in the Safety Life Cycle

Safety Analysis Methods within the Safety Life Cycle

Mapping of Safety Analyses to the Safety Life Cycle

ISO 26262 defines a safety life cycle that encompasses the main safety activities in the concept phase, product development, production, operation, service, and decommissioning. As an important part of product development, safety analyses shall be performed at the system, hardware, and software levels. The level of detail of the failure model described in a safety analysis depends on the level of detail of the corresponding development sub-phase and is consistent within that sub-phase (see Figure 6). For example, in the concept phase, safety analyses are performed on the preliminary architecture at the appropriate level of abstraction; in the product development phases, the level of detail of the analysis can depend on the development sub-phase and take the applied safety mechanisms into account.

Figure 7. Safety Analyses in the Hardware Design Phase of the Safety Life Cycle

Typically, safety analyses are associated with design-phase activities. In the concept, system, and hardware development phases, they accompany activities such as architectural design and integration verification of the system and hardware (Figure 7); similarly, in the software development phase, safety analyses accompany software development activities such as software architectural design, unit design, and verification (Figure 8).

Figure 8. Safety Analyses in the Software Design Phase of the Safety Life Cycle

Safety Analyses in Concept Phase

In the concept phase, ISO 26262 recommends qualitative safety analysis, in particular FMEA, FTA, or HAZOP as mentioned in the standard, to support the derivation of a valid set of functional safety requirements. First, a qualitative safety analysis of the system architectural design provides evidence that the design is suitable to provide the specified safety-related functionality and safety-related attributes, for example by analyzing requirements for the independence of, or freedom from interference between, parts or elements of the system, and by determining the causes of failures and the effects of malfunctions. Second, if safety-related elements and interfaces have already been defined, safety analyses help to identify previously unknown safety-related elements and interfaces. Finally, safety analyses support the specification of improved safety designs by validating the effectiveness of the safety mechanisms against the identified failure causes and failure effects.

Considering the potential adverse effects of the safety of the intended functionality (SOTIF) and of cybersecurity on the achievement of functional safety contributes to the overall development of a safe E/E system. Similar considerations apply to subsequent development phases; SOTIF and cybersecurity analyses are beyond the scope of this article.

Figure 9. Failure Classification of Safety-Related Hardware Components for Relevant Items - ISO 26262-5:2018(E)

Safety Analyses in Hardware Design Phase

1. Qualitative Safety Analyses in Hardware Design Phase

In the hardware design phase, various safety analysis techniques are combined. On the one hand, there is the qualitative safety analysis of the hardware design. For example, qualitative FTA helps to identify the causes and effects of hardware failures. For safety-related hardware components or hardware parts, qualitative FMEA helps to distinguish the different fault classes, in particular which faults are safe faults, which are single-point or residual faults, and which are multiple-point faults. In line with the recommendations of the ISO 26262 standard, a combination of deductive and inductive analysis methods should be used for these safety analyses.

2. Quantitative Safety Analyses in Hardware Design Phase

On the other hand, when addressing random hardware failures, quantitative safety analyses of the hardware design are required. Quantitative safety analysis helps to evaluate or calculate the metrics related to the hardware architectural design: the single-point fault metric (SPFM), the latent fault metric (LFM), and the probabilistic metric for random hardware failures (PMHF). Typically, the Failure Modes, Effects and Diagnostic Coverage Analysis (FMEDA) method is used to perform this quantitative analysis and to evaluate the suitability of the hardware architectural design for detecting and controlling safety-related random hardware failures. This is done by analyzing the scenarios in which safety goals are violated due to random hardware failures and calculating the corresponding metric values for the hardware architectural design in question.

2.1. Failure classification of safety-related hardware components

Failures occurring in safety-related hardware components should be categorized as:

a) Single point fault

b) Residual fault

c) Multiple point fault

d) Safe fault

Multiple-point faults are further differentiated into latent, detected, and perceived faults. Safety-related hardware element failures are thus categorized as shown in Figure 9.

Figure 10. Classification of Failure Modes for Hardware Elements

In this classification:

  • Distance n indicates the number of independent faults that must occur in combination to lead to a violation of a safety goal (single-point or residual faults: n = 1, dual-point faults: n = 2, etc.).
  • Faults at distance n are located between the n-ring and the (n-1)-ring.
  • Unless explicitly addressed in the technical safety concept, multiple-point faults with a distance greater than 2 are considered safe faults.

Note that in the case of a transient fault for which the safety mechanism restores the item to a fault-free state, the fault is a detected multiple-point fault even if the driver is never notified of its presence. For example, if memory is protected against transient faults by an error-correcting code and the safety mechanism not only supplies the CPU with the corrected value but also repairs the flipped bits within the memory array (e.g., by writing back the corrected value), the item is restored to a fault-free state.

2.1.1. Single point fault

A fault in a hardware element that is not covered by any safety mechanism and that directly leads to the violation of a safety goal. An example is an unmonitored resistor with at least one failure mode (e.g., open circuit) that can violate the safety goal.

2.1.2. Residual fault

A fault in a hardware element that can directly lead to the violation of a safety goal even though at least one safety mechanism addresses faults of that element, because the diagnostic coverage of the safety mechanism is less than 100 %. For example, if a RAM module is checked only by a checkerboard RAM test, certain kinds of bridging faults are not detected, and violations of the safety goal due to these faults are not covered by the safety mechanism. These uncovered faults are residual faults.

2.1.3. Detected two-point faults

Faults that are detected by a safety mechanism, which prevents them from remaining latent, and that can lead to the violation of a safety goal only in combination with another independent hardware fault (i.e., as part of a dual-point failure). For example, in a flash memory protected by parity, a single-bit fault is detected and, in accordance with the technical safety concept, triggers a reaction such as shutting down the system and informing the driver via a warning light.

2.1.4. Perceived two-point faults

Faults that may or may not be detected by a safety mechanism within a defined time interval, that are perceived by the driver, and that can lead to the violation of a safety goal only in combination with another independent hardware fault (i.e., as part of a dual-point failure). Examples are dual-point faults whose consequences clearly and unambiguously affect the function, so that the driver perceives them.

2.1.5. Latent two-point faults

Faults that are neither detected by a safety mechanism nor perceived by the driver: the system remains operational, and the driver is not made aware of the fault until a second independent hardware fault occurs.

For example, in ECC-protected flash memory, the ECC corrects a single-bit permanent fault during a read, but the content of the flash memory itself is not corrected and no signal indicates the correction. In this case, the fault cannot lead to a violation of the safety goal (because the faulty bit is corrected on read), but it is neither detected (because the single-bit fault is not signaled) nor perceived (because it has no impact on the functionality of the application). If an additional fault occurs in the ECC logic, control over single-bit faults can be lost, potentially leading to a violation of the safety goal.

2.1.6. Safe fault

Safe faults fall into two categories:

a) All n-point faults with n > 2, unless the safety concept identifies them as relevant to the violation of a safety goal; or

b) Faults that do not lead to the violation of a safety goal.

An example is a single-bit fault that is corrected by the ECC but not signaled, in a flash memory protected by both ECC and a cyclic redundancy check (CRC). The ECC prevents the fault from violating the safety goal, but does not signal it. If the ECC logic fails, the CRC can still detect the fault and the system is shut down. The safety goal is violated only if a single-bit fault exists in the flash memory, the ECC logic fails, and the CRC check and monitoring also fail (n = 3).

2.2. Failure modes and failure rates of hardware elements

2.2.1. Failure modes of hardware elements

According to the fault classification model, the failure modes of hardware elements are categorized as shown in Figure 10.

Figure 11. Flowchart for Failure Model Classification

2.2.2. Hardware Element Failure Mode Classification Process

The failure mode classification process is shown in Figure 11.

In Figure 11:

λSPF is the failure rate associated with single-point faults of the hardware element

λRF is the failure rate associated with residual faults of the hardware element

λMPF is the failure rate associated with multiple-point faults of the hardware element

λS is the failure rate associated with safe faults of the hardware element

The failure rate associated with multiple-point faults, λMPF, can be expressed according to Equation (1-1) as:

λMPF = λMPF,DP + λMPF,L    (1‑1)

where:

λMPF,DP is the failure rate associated with detected or perceived multiple-point faults of the hardware element

λMPF,L is the failure rate associated with latent multiple-point faults of the hardware element
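A reduced version of this classification and of the decomposition in Equation (1-1) can be sketched in code. The decision flags and coverage values below are simplifications of the flowchart in Figure 11 and are illustrative only, not a substitute for the classification in the standard.

```python
# Simplified sketch of the failure-rate decomposition behind Equation (1-1),
# loosely following the classification flow of Figure 11.

def split_failure_rate(lam, violates_sg_directly, contributes_in_combination,
                       dc_residual, dc_latent):
    """Split an element failure rate lam (in FIT) into contributions to
    lambda_SPF, lambda_RF, lambda_MPF,DP, lambda_MPF,L and lambda_S."""
    if violates_sg_directly:
        if dc_residual == 0.0:                  # no safety mechanism at all
            return {"SPF": lam, "RF": 0.0, "MPF_DP": 0.0, "MPF_L": 0.0, "S": 0.0}
        rf = lam * (1.0 - dc_residual)          # uncovered share -> residual
        covered = lam - rf                      # covered share -> multiple-point
        mpf_l = covered * (1.0 - dc_latent)     # not covered w.r.t. latent faults
        return {"SPF": 0.0, "RF": rf, "MPF_DP": covered - mpf_l, "MPF_L": mpf_l, "S": 0.0}
    if contributes_in_combination:              # dual-point (or relevant n-point) faults
        mpf_l = lam * (1.0 - dc_latent)
        return {"SPF": 0.0, "RF": 0.0, "MPF_DP": lam - mpf_l, "MPF_L": mpf_l, "S": 0.0}
    return {"SPF": 0.0, "RF": 0.0, "MPF_DP": 0.0, "MPF_L": 0.0, "S": lam}   # safe fault

parts = split_failure_rate(20.0, violates_sg_directly=True,
                           contributes_in_combination=True,
                           dc_residual=0.99, dc_latent=0.90)
print(parts, "lambda_MPF =", parts["MPF_DP"] + parts["MPF_L"])   # Equation (1-1)
```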

2.3. Hardware Architecture Metrics

Hardware architecture metrics are used to assess the effectiveness of the associated item architecture in coping with random hardware failures.

The goals of hardware architecture metrics are:

  • Be objectively evaluable: the metrics are verifiable and precise enough to distinguish between different architectures;
  • Support the evaluation of the final design (based on the detailed hardware design with accurate calculations);
  • Provide pass/fail criteria for hardware architectures based on the ASIL;
  • Indicate the adequacy of the coverage by safety mechanisms used to prevent the risk of single-point or residual faults in the hardware architecture (single-point fault metric);
  • Indicate the adequacy of the coverage by safety mechanisms used to protect against the risk of latent faults in the hardware architecture (latent fault metric);
  • Address single-point, residual, and latent faults;
  • Ensure the robustness of the hardware architecture;
  • Be limited to safety-related elements only;
  • Support applications at different element levels, such as assigning target values for vendor hardware elements; for example, target values can be assigned to microcontrollers or ECUs to facilitate distributed development.

2.3.1. Single-Point Fault Metric (SPFM)

The single-point fault metric reflects the robustness of the item to single-point and residual faults, either through the coverage provided by safety mechanisms or through design (primarily safe faults). A high single-point fault metric indicates that the proportion of single-point and residual faults in the hardware of the item is low.

For hardware designs implementing safety goals rated ASIL (B), C, or D, Equation (1-2) is used to determine the single-point fault metric.
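It can be written (summing over the safety-related hardware elements of the item; see ISO 26262-5 for the exact definition) as:

SPFM = 1 − Σ (λSPF + λRF) / Σ λ    (1‑2)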

Figure 12. Graphical Representation of the Single-Point Fault Metric - ISO 26262-5:2018(E)

Only the safety-related hardware elements of the item are considered. Hardware elements exhibiting only safe faults or n-point faults with n > 2 may be omitted from the calculation unless they are explicitly addressed in the technical safety concept. A graphical representation of the single-point fault metric is shown in Figure 12.

2.3.2. Latent Fault Metric (LFM)

The latent fault metric reflects the robustness of the item to latent faults, either through the coverage of faults by safety mechanisms, through the driver recognizing the presence of the fault before a safety goal is violated, or through design (primarily safe faults). A high latent fault metric implies a low proportion of latent faults in the hardware.

For hardware designs implementing safety goals rated ASIL (B), (C), or (D), Equation (1-3) is used to determine the latent fault metric.
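Analogously to the SPFM, it can be written (again summing over the safety-related hardware elements; see ISO 26262-5 for the exact definition) as:

LFM = 1 − Σ λMPF,L / Σ (λ − λSPF − λRF)    (1‑3)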

Figure 13. Graphical Representation of Latent Fault Metrics - ISO 26262-5:2018(E)

Only the safety-related hardware elements of the item are considered. Hardware elements exhibiting only safe faults or n-point faults with n > 2 are omitted from the calculation unless they are explicitly addressed in the technical safety concept. A graphical representation of the latent fault metric is shown in Figure 13.

Table 1. Hardware Architecture Design Metrics and Standard Requirements

2.3.3. Probabilistic Metric for Random Hardware Failures (PMHF)

As shown in Equation (1-4), the PMHF value is estimated as:

PMHFest = λSPF + λRF + λDPF,det × λDPF,latent × Tlifetime    (1‑4)

For each failure mode, calculate its contribution to the total PMHF value as a percentage.
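As a small numeric illustration of Equation (1-4), the snippet below uses invented failure rates (in FIT, i.e., 1e-9 per hour) and a hypothetical vehicle lifetime; the values carry no significance beyond showing the arithmetic.

```python
# Numeric illustration of the PMHF estimate in Equation (1-4).
# All rates and the lifetime are hypothetical example values.

FIT = 1e-9                     # 1 FIT = 1e-9 failures per hour

lam_spf        = 0.1 * FIT     # single-point faults
lam_rf         = 0.5 * FIT     # residual faults
lam_dpf_det    = 20.0 * FIT    # detected dual-point faults
lam_dpf_latent = 2.0 * FIT     # latent dual-point faults
t_lifetime_h   = 10_000.0      # assumed operating lifetime in hours

pmhf = lam_spf + lam_rf + lam_dpf_det * lam_dpf_latent * t_lifetime_h
print(f"PMHF ~ {pmhf:.3e} per hour ({pmhf / FIT:.3f} FIT)")

# Contribution of each term as a percentage of the total, as suggested above.
terms = {
    "single-point": lam_spf,
    "residual": lam_rf,
    "dual-point (detected x latent)": lam_dpf_det * lam_dpf_latent * t_lifetime_h,
}
for name, value in terms.items():
    print(f"{name}: {100.0 * value / pmhf:.1f} %")
```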

2.4 Hardware Architecture Metrics Target Values

For the hardware architectural design metrics, the standard provides corresponding target values (shown in Table 1), which depend on the highest ASIL that the hardware design has to meet. For ASIL A, the standard does not recommend target values; for ASIL D, it recommends the most stringent targets; and for some cases of ASIL B and ASIL C, the metrics are recommendations rather than mandatory requirements in the sense of the standard.
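The pass/fail comparison against such targets is easy to automate. In the sketch below, the target values are those commonly quoted from ISO 26262-5 for SPFM, LFM, and PMHF; they should be confirmed against Table 1 and the standard for a concrete project.

```python
# Sketch of a pass/fail check of hardware architectural metrics against targets.
# Target values as commonly quoted from ISO 26262-5; confirm against the standard.

TARGETS = {
    # ASIL: (minimum SPFM, minimum LFM, maximum PMHF in 1/h)
    "B": (0.90, 0.60, 1e-7),
    "C": (0.97, 0.80, 1e-7),
    "D": (0.99, 0.90, 1e-8),
}

def meets_targets(asil, spfm, lfm, pmhf):
    if asil not in TARGETS:        # e.g., ASIL A: no target values recommended
        return True
    min_spfm, min_lfm, max_pmhf = TARGETS[asil]
    return spfm >= min_spfm and lfm >= min_lfm and pmhf <= max_pmhf

print(meets_targets("D", spfm=0.992, lfm=0.93, pmhf=6.0e-9))   # True
print(meets_targets("C", spfm=0.95,  lfm=0.85, pmhf=5.0e-8))   # False: SPFM below 97 %
```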

Figure 14. Temporal Interference Leading to Cascading Failures - ISO 26262-6:2018(E)

Safety Analyses in the Software Design Phase

1. Safety-related functions and attributes

The derivation of software safety requirements should consider the safety-related functionality and safety-related attributes required by the software, where a failure of safety-related functions or a failure of safety-related attributes may result in a violation of the technical safety requirements assigned to the software.

Safety-related functions of software typically include:

  • Functions that enable safe execution of the nominal function
  • Functions that enable the system to achieve or maintain a safe or degraded state
  • Functions related to detecting, indicating, and mitigating failures of safety-relevant hardware elements
  • Self-testing or monitoring functions related to detecting, indicating, and mitigating the failure of the operating system, underlying software, or the application software itself
  • Functions related to on-board or off-board testing during production, operation, service, and end of life
  • Functions that allow modification of software during production and service
  • Functionality related to performance or time-critical operations

Safety-related attributes typically include:

  • Robustness to erroneous inputs
  • Independence or interference freedom between different functions
  • Fault tolerance of the software, etc.

2. Safety analysis in the software architecture design phase

During the software architecture design phase, we should perform the appropriate software safety analysis activities, which the standard calls safety-oriented analysis methods. Safety-oriented analysis is essentially qualitative. First, safety-oriented analysis helps provide evidence that the software is suitable to provide the specified safety-related function and attributes as required by the desired ASIL level. Secondly, safety-oriented analysis helps to identify or validate safety-related software content. Finally, safety-oriented analysis supports the development of safety mechanisms and thus verifies the effectiveness of safety measures.

Where freedom from interference or sufficient independence between safety-related elements is required, the standard recommends performing a dependent failure analysis (DFA). The DFA identifies the relevant failures, or potentially relevant failures, and their effects. The objective and scope of the DFA depend on the sub-phase and the level of abstraction at which the analysis is performed, and the elements and failures of interest are defined before the analysis is carried out (e.g., in the safety plan). The DFA is also a qualitative analysis.

3. Analysis of safety-related failures at the software architecture level

The embedded software has to provide the specified functions, behaviors, and attributes with the integrity required by the assigned ASIL. Safety-related failure analysis at the software architecture level is applied to check or confirm that the corresponding safety functions and attributes achieve the assigned ASIL requirements.

3.1 Purpose of safety analysis at the software architecture level

During the software architectural design phase, the relevant failure analysis shall be conducted to identify single events, faults, or failures that could cause multiple software elements required to be independent to fail (e.g., cascading and/or common-cause failures, including common-mode failures), as well as single events, faults, or failures that may initiate a causal chain leading to the violation of a safety requirement by propagating from one software element to another (e.g., cascading failures). Through this failure analysis at the software architectural design stage, the degree of independence or freedom from interference achieved between the related software architectural elements can then be examined.

3.2 Software Architecture Level Safety Analysis

Relevant failure analysis at the software architecture level shall consider the following:

  • Identifying possible design weaknesses, conditions, errors, or failures that could trigger a causal chain leading to violation of safety requirements (e.g., using inductive or deductive methods)
  • Analyzing the consequences of possible faults, failures, or causal chains on the functions and attributes required of the software architectural elements

3.3 Application Scenarios for Relevant Failure Analysis at the Software Architecture Level

The following scenarios may require failure analysis at the software architecture level:

  • Applying ASIL decomposition at the software level
  • Implementing software safety requirements, such as providing the evidence for the effectiveness of software safety mechanisms, where the independence between monitored elements and monitoring elements must be ensured

4. Deriving Safety Mechanisms from Software Architecture Level Safety Analysis

Safety measures include safety mechanisms derived from safety-oriented analyses and cover issues related to random hardware failures as well as software failures. Based on the results of the safety analyses performed at the software architecture level, error detection safety mechanisms and error handling safety mechanisms are implemented.

4.1 Error Detection Safety Mechanism

Error detection safety mechanisms include:

  • Range checks on input and output data
  • Plausibility checks (e.g., using reference models of the desired behavior, assertion checks, or comparing signals from different sources)
  • Data error detection (e.g., error detection codes and redundant data storage)
  • Monitoring of program execution by an external element (e.g., an ASIC) or by another software element that performs a watchdog function; the monitoring can be logical, temporal, or both
  • Temporal monitoring of program execution
  • Diverse redundancy design
  • Access violation control mechanisms implemented in software or hardware, used to allow or deny access to safety-related shared resources

The example in Figure 14 illustrates the interference caused by conflicting use of a shared resource (e.g., a shared processing element): a QM software element interferes with and delays the timely execution of an ASIL software element (such interference can also occur between software elements with different ASILs). The upper half of the figure shows the software execution without a mechanism to detect this interference. By introducing "checkpoints" into the software and monitoring them against timeouts, timing disturbances can be detected and appropriate countermeasures taken; a minimal sketch of such checkpoint monitoring follows.
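The sketch below illustrates checkpoint-based temporal monitoring; the checkpoint names, deadlines, and the simple elapsed-time scheme are illustrative assumptions rather than a specific production mechanism.

```python
# Illustrative sketch of checkpoint-based temporal program-flow monitoring.
# Checkpoint names and deadlines are hypothetical example values.

import time

class CheckpointMonitor:
    def __init__(self, deadlines):
        # deadlines: maximum allowed seconds between consecutive checkpoints
        self.deadlines = deadlines
        self.last_time = time.monotonic()

    def reached(self, checkpoint):
        now = time.monotonic()
        elapsed = now - self.last_time
        self.last_time = now
        if elapsed > self.deadlines[checkpoint]:
            # Timing violation detected: trigger a reaction, e.g. transition to a
            # safe or degraded state (here we only report the violation).
            print(f"timeout before {checkpoint}: {elapsed * 1000:.1f} ms")
            return False
        return True

monitor = CheckpointMonitor({"CP_read_inputs": 0.010,
                             "CP_compute": 0.020,
                             "CP_write_outputs": 0.010})

# Simulated execution of the monitored ASIL software element:
monitor.reached("CP_read_inputs")
time.sleep(0.03)                   # interference from a QM element delays execution
monitor.reached("CP_compute")      # 20 ms deadline exceeded -> interference detected
monitor.reached("CP_write_outputs")
```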

4.2 Safety mechanisms for error handling

Safety mechanisms for error handling include:

  • Deactivation to achieve and maintain a safe state
  • Static recovery mechanisms (e.g., module recovery, backward recovery, forward recovery, and repeated recovery)
  • Graceful degradation by prioritizing functions to minimize the negative impact of potential failures on functional safety
  • Homogeneous redundancy in design, which focuses primarily on controlling the effects of transient or random failures in hardware executing similar software (e.g., temporary redundant execution of software)
  • Diverse redundancy in design, which involves designing different software in each parallel path and focuses mainly on preventing or controlling systematic failures in the software
  • Error-correcting codes for data
  • Access rights management implemented in software or hardware to grant or deny access to safety-related shared resources

It is important to note that a review of system-level software safety mechanisms (including robustness mechanisms) can be performed to analyze their potential influence on system behavior and their alignment with technical safety requirements.

References

  1. International Organization for Standardization. (2018). Road vehicles — Functional safety (ISO 26262:2018). https://www.iso.org/standard