The Use of Bayesian Networks in Functional Safety

Functional safety engineers follow the ISA/IEC 61511 standard and perform calculations based on random hardware failures. These calculations yield very low failure probabilities, which are then combined with similarly low failure probabilities for other safety layers to show that the overall probability of an accident is extremely low (e.g., 1E-5/yr). Unfortunately, such numbers rest on frequentist assumptions and cannot be proven. A look at actual accidents involving control and safety system failures shows that they are not caused by random hardware failures. They are typically the result of a slow, steady normalization of deviation (a.k.a. drift). Controlling these factors is up to management, and they fall outside the hardware calculation. Bayes' theorem, however, can be used to update our prior belief (the initial calculated failure probability) based on other observed evidence (e.g., the effectiveness of the facility’s process safety management process).

The results can be dramatic. For example, take a safety instrumented function with a risk reduction factor of 5,000 (i.e., SIL 3 performance) and a process safety management program that is 99% effective. The function then actually delivers a risk reduction factor of just 98 (i.e., essentially the borderline between SIL 1 and SIL 2). The key takeaway is that the focus of functional safety should be on effectively following all the steps in the ISA/IEC 61511 safety lifecycle and the requirements of the OSHA PSM regulation, not on the math or the certification of devices. Both documents were essentially written in blood, through lessons learned the hard way by many organizations.
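The risk reduction factor of 98 quoted above can be reproduced with a very small calculation. The sketch below is one possible reading, not necessarily the model used in the full paper: it assumes a two-node network in which the function achieves its calculated probability of failure on demand (1/5,000) when process safety management is effective, and fails with probability 1 when it is not. The function name `effective_rrf` and the parameter `pfd_if_psm_ineffective` are hypothetical names chosen for illustration.

```python
def effective_rrf(design_rrf: float,
                  psm_effectiveness: float,
                  pfd_if_psm_ineffective: float = 1.0) -> float:
    """Risk reduction factor after accounting for PSM effectiveness.

    Assumption (not stated in the source): if the process safety
    management program is ineffective, the safety function fails on
    demand with probability `pfd_if_psm_ineffective` (1.0 by default).
    """
    # Calculated probability of failure on demand from the hardware math.
    pfd_design = 1.0 / design_rrf

    # Law of total probability over the two PSM states.
    p_fail = (psm_effectiveness * pfd_design
              + (1.0 - psm_effectiveness) * pfd_if_psm_ineffective)

    return 1.0 / p_fail


if __name__ == "__main__":
    # SIL 3 design (RRF 5,000) with a 99% effective PSM program.
    print(round(effective_rrf(5000, 0.99)))  # prints 98
```

Under these assumptions the 1% chance of ineffective management dominates the result: it contributes 0.01 to the failure probability, against roughly 0.0002 from the hardware calculation.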

To learn more about the use of Bayesian networks in functional safety, read the full paper here.