Search Results | aeSolutions

Reverend Bayes, Meet Process Safety-Use Bayes’ Theorem to Establish Site Specific Confidence in LOPA
The Process Industry has an established practice of crediting IPLs (Independent Protection Layers) to meet risk reduction targets as part of LOPA (Layer of Protection Analysis) studies. Often the risk targets are calculated to be on the order of 1E-4 per year or lower. Achieving the risk target on paper is one thing, but what is missing from the LOPA calculation is a statement of the confidence in the result. LOPA is an order-of-magnitude method, however, this only reflects the tolerance of error, not the tolerance of uncertainty. It is often stated that LOPA uses generic credits that are conservative, thereby implying the LOPA result should be conservative. By itself this statement is dubious because the generic data used in LOPA did not originate from the facility for which the statistical inferences are being made (which for frequentist-based statistics makes the inference invalid). Worse, when conservative credits are multiplied together to produce a rare-event number, does the conservative property emerge from the combination? There is no way to answer this question without performing IPL Validation (i.e., ensuring the IPL will function when needed). However, IPL Validation and related Safety Life-cycle methods (e.g. functional safety assessments and cyber-security audits related to barrier integrity) are purely qualitative and have no apparent relation to the quantitative risk target. There is a need therefore, to bridge the qualitative results of IPL validation with the quantitative result of the associated LOPA calculation, as a way to establish a site-specific confidence level in the risk target we are trying to achieve. This is where Bayes’ Theorem comes in. Bayes’ Theorem is an epistemological statement of knowledge, versus a statement of proportions and relative frequencies. It is therefore a method that can bridge qualitative knowledge with the rare-event numbers that are intended to represent that knowledge. 
Bayes’ Theorem is sorely missing from the toolbox of Process Safety practitioners. This paper will introduce Bayes’ Theorem to the reader and discuss the reasons and applications for using Bayes in Process Safety related to IPLs and LOPA. While intended to be introductory (to not discourage potential users), this paper will describe simple Excel based Bayesian calculations that the practitioner can begin to use immediately to address issues such as uncertainty, establishing confidence intervals, properly evaluating LOPA gaps, and incorporating site specific data, all related to IPLs and barriers used to meet LOPA targets. Click here to view the complete whitepaper
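As a rough illustration of the kind of simple Bayesian calculation the abstract alludes to, the sketch below updates a generic IPL credit with site-specific proof test results using a conjugate Beta-Binomial model. The prior parameters, the test counts, and the function names are assumptions chosen for illustration. They are not taken from the whitepaper, which describes Excel-based calculations.

```python
import math

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def update_pfd(prior_a, prior_b, failures, demands):
    """Conjugate Beta-Binomial update; returns the posterior (a, b)."""
    return prior_a + failures, prior_b + (demands - failures)

def prob_pfd_below(a, b, limit, steps=100_000):
    """P(PFD <= limit), by midpoint-rule integration of the Beta density."""
    width = limit / steps
    return sum(beta_pdf((i + 0.5) * width, a, b) for i in range(steps)) * width

# Hypothetical weak prior with mean 0.1, standing in for one generic LOPA credit.
prior_a, prior_b = 1.0, 9.0

# Hypothetical site evidence: 1 failure found in 40 proof tests or demands.
post_a, post_b = update_pfd(prior_a, prior_b, failures=1, demands=40)

posterior_mean = post_a / (post_a + post_b)
confidence = prob_pfd_below(post_a, post_b, limit=0.1)

print(f"posterior mean PFD = {posterior_mean:.3f}")
print(f"P(PFD <= 0.1)      = {confidence:.3f}")
```

The second printed value is one way to read "site-specific confidence": the posterior probability that the IPL actually performs at or better than the credit taken in the LOPA.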
Methodologies in Reducing Systematic Failures of Wired IPLs
by Richard E. Hanner & Tab Vestal
The history of high consequence incidents in industry reveals that most accidents were the result of systematic failures, not hardware failures. However, engineering effort is often focused more heavily on the quantifiable failures of hardware. Process Safety risk gaps are often closed or reduced by several types of Independent Protective Layers (IPLs). Two common types are Safety Instrumented Functions (SIFs) and Basic Process Control System (BPCS) functions. The SIFs typically reside within a SIL-rated programmable logic controller, and their achieved quantitative performance is calculated based on random hardware failures of the SIF hardware components. Conversely, BPCS protective layers are assigned generic industry-accepted probability of failure credits. These generic credits are conservatively assigned and take into account unquantifiable human-induced systematic failures. In either case, the likelihood of systematic failures can be reduced by recognizing the design, specification, maintenance, and operations activities that are potential sources, and applying measures to prevent or reduce them. By reducing systematic failures, you reduce the risk in the industrial process and increase confidence in meeting the intended integrity requirements. This technical paper will discuss the common sources of systematic failures and the preventive or mitigative measures that reduce their occurrence.
Topics included in whitepaper: systematic failure, random hardware failure, Independent Protective Layer, IPL, SIF, SIS, BPCS, common cause, Human Factor Analysis, SIL Verification
Lessons Learned on SIL Verification and SIS Conceptual Design
by Richard E. Hanner & aeSolutions Technical Team
There are many critical activities and decisions that take place prior to and during the Safety Integrity Level (SIL) Verification and other Conceptual Design phases of projects conforming to ISA84 & ISA/IEC 61511. These activities and decisions introduce either opportunities to optimize or obstacles that impede project flow, depending on when and how they are managed. Implementing Safety Instrumented System (SIS) projects that support the long-term viability of the Process Safety Lifecycle requires that SIS Engineering be treated as an engineering discipline in its own right, one that receives from, and feeds to, other engineering disciplines. This paper will examine lessons learned within the SIS Engineering discipline and between engineering disciplines that help or hinder SIS project execution in achieving the long-term viability of the Safety Lifecycle. Avoiding these pitfalls can allow your projects to achieve the intended risk reduction and conformance to the ISA/IEC 61511 Safety Lifecycle, while avoiding the costs and delays of late-stage design changes. Alternate execution strategies will be explored, as well as the risks of moving forward when limited information is available.
Topics include: IEC 61511, ISA/IEC 61511, Safety Instrumented Systems (SIS), Independent Protection Layers (IPL), Functional Safety Assessment (FSA), Safety Requirement Specification (SRS), Safety Lifecycle, Functional Safety Management Plan (FSMP), Project Execution Plan (PEP), SIS Front-End Loading (SIS FEL), Layer of Protection Analysis (LOPA), SIL Verification
FGS 1400 MK II - Evolution of the traditional Fire panel
by Warren Johnson, PE, PMP
In 2005, aeSolutions recognized an industry need for Fire and Gas panels based on a SIL capable PLC safety control platform. Large industrial clients were looking for a system capable of monitoring and controlling fire system I/O, combustible gas, toxic gas, and oxygen depletion detectors, initiating suppression release, controlling HVAC, and performing process safety shutdowns. To develop the Fire and Gas system requirements needed by industry, we first needed to understand the regulatory requirements, applicable industry standards, and the types of fire and gas systems currently in use. Here are some of the key regulatory requirements mandated by OSHA:
- OSHA 1910.155 Fire Detection: 3rd party approval by a nationally recognized laboratory
- OSHA 1910.164 Fire Detection Systems: circuit supervision
- OSHA 1910.165 Employee Alarm Systems: circuit supervision, power supply monitoring
Another key driver is determining which industry standards are applicable and whether they are mandatory. Many local and state codes reference the International Building Code. This code requires the use of NFPA 72 for fire alarm signaling systems. The authority having jurisdiction (AHJ) in each jurisdiction has the final authority in determining the applicable standards that the fire alarm system must meet.
Improving the Safety Instrumented System (SIS) Design Process with Graphic Diagrams
by Keith A. Brumbaugh, PE
During a Safety Instrumented System (SIS) implementation project at a plant site new to the ANSI/ISA 84 process safety lifecycle world, we discovered the importance of utilizing graphic diagrams in the development of SIS-related documentation to support the on-site team meetings and document decisions. In a room full of plant operators and engineers accustomed to working “hands on” in the field, it was often far easier to keep the team on track when they were provided with a drawing to discuss, as opposed to having the team look at a screen full of text. The graphic diagrams benefited the design team equally: we received more focused team member feedback, allowing for more efficient and thorough updates to documentation. This method of capturing team member input also enabled concise integration of that input into various SIS-related documents during and after the meetings. Examples of these graphic diagrams included the following:
- A logic solver block diagram, used to quickly identify which Logic Solver Safety PLCs, Independent Protection Layers (IPLs), Logic Narratives, and Equipment were related to each other.
- Logic flow diagrams for heaters and boilers, used to visualize the order in which light-off permissives would be met, which statuses would cause a partial or complete trip, and related IPLs.
- SIF diagrams, used to depict complex SIF architecture and keep track of how a SIF would function.
The author will present examples of the different types of graphic diagrams, the ways in which the diagrams were utilized, and the benefits that each provided in the implementation of certain phases of an ANSI/ISA 84 SIS lifecycle project. These diagrams were considered to be valuable process safety information and part of the final SIS Front End Loading design.
Using the STAMP Systems-Based Approach to Identify Hazards for the Transient Operating State
STAMP ( Systems Theoretic Accident Model and Processes ) is a relatively new accident causality model based on systems theory. It draws its main tenets from systems thinking that (1) accidents can happen even when there has been no failure, (2) that interactions between components of the system create emergent properties that can lead to failure, and (3) it treats accidents as a control problem rather than a failure problem. STPA (Systems Theoretic Process Analysis) or colloquially “Stuff That Prevents Accidents” is a powerful hazard analysis technique based on STAMP. The STPA technique is based on a control structure rather than a traditional hardware-based structure as typically shown on a P&ID (Piping & Instrumentation Diagram). STPA is not so concerned with identifying component failures, but rather how those components interact and what controls or constraints are placed on the interactions that can lead to hazards. The STPA technique is a good fit for identifying the ways hazards can arise during transient operating states such as maintenance, start-up, or response to abnormal situation. It identifies unsafe or missing controls related to the transient mode needed to prevent an accident. It works off of a control structure of the transient mode versus procedures or P&IDs. A typical control structure can include components, humans, software, requirements, expectations (written and unwritten). Traditional PHA (Process Hazards Analysis) methods such as HAZOP or What-if will not provide the same perspective. This paper will provide two examples of transient mode control structures, one for maintenance and one for response to abnormal situation, and show how to perform the STPA hazard analysis on those control structures to ensure the proper controls and constraints are identified to prevent an unwanted event. This paper was originally presented at the 2022 AIChE Spring Meeting & 18th Global Congress on Process Safety. 
Decoding SIS: Are You Doing What’s Necessary to Prevent Disasters?
By Emily Henry, PE(SC), CFSE & aeSolutions Technical Team When your facility is tasked with industry safety standard compliance, where do you start? What do all those SIS acronyms mean? For OSHA PSM-covered facilities, adherence to a functional safety lifecycle can be a critical step in overall SIS performance assurance. What is hiding under the radar of a plant SIS? Risk assessments define hazard consequences with assumed initiating event frequencies. How do we prevent these consequences? By verifying the reliability and availability assumptions of SIL Verification design parameters. Without understanding the design parameters your SIS is based upon, or without proper maintenance of your SIS equipment, your risk assessment gap closure may be incomplete. What factors into the assumptions of an SIS design? Are your safety devices replaced at their specified asset life, tested at the interval, and tested with the necessary rigor to uncover dangerous failures as specified in your calculations? What does following the Functional Safety Lifecycle entail? Does your facility have a Functional Safety Management Plan, perform Functional Safety Assessments on your SIS Design, and keep records of device failures to evaluate field performance against assumed reliability? This paper illustrates the real consequences of failing to uphold SIS design assumptions or follow the Functional Safety Lifecycle. Click here to view the complete whitepaper Prepared for Presentation at American Institute of Chemical Engineers 2024 Spring Meeting and 20th Global Congress on Process Safety New Orleans, LA March 24-28, 2024
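To illustrate why the proof test interval assumed in a SIL Verification calculation matters, the sketch below uses the common simplified approximation for a single-channel (1oo1) function, PFDavg ≈ λDU × TI / 2. The failure rate and test intervals are hypothetical values, not figures from the paper, and a real calculation would also account for test coverage, repair time, and common cause.

```python
HOURS_PER_YEAR = 8760

def pfd_avg_1oo1(lambda_du_per_hour, test_interval_years):
    """Simplified average PFD for a 1oo1 function: lambda_DU * TI / 2."""
    return lambda_du_per_hour * test_interval_years * HOURS_PER_YEAR / 2

def sil_band(pfd_avg):
    """Low-demand SIL band for a given PFDavg (0 means below SIL 1)."""
    if pfd_avg >= 1e-1:
        return 0
    if pfd_avg >= 1e-2:
        return 1
    if pfd_avg >= 1e-3:
        return 2
    if pfd_avg >= 1e-4:
        return 3
    return 4  # SIL 4 band or better

# Hypothetical dangerous undetected failure rate for the whole function.
lambda_du = 2e-6  # failures per hour

for interval_years in (1, 2):
    pfd = pfd_avg_1oo1(lambda_du, interval_years)
    print(f"TI = {interval_years} yr: PFDavg = {pfd:.2e}, SIL band = {sil_band(pfd)}")
```

With these assumed numbers, letting the test interval slip from one year to two moves the function from the SIL 2 band into the SIL 1 band, which is the kind of silent erosion of a design assumption the abstract warns about.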
Does Your Facility Have the Flu? Use Bayes Rule to Treat the Problem Instead of the Symptom
by Keith Brumbaugh
Is our industry addressing the problems facing it today? We idealize infinitesimally small event rates for highly catastrophic hazards, yet are we any safer? Have we solved the world’s problems? Layers of protection analysis (LOPA) drives hazardous event rates to 10^-4 per year or less, yet industry is still experiencing several disastrous events per year. If one estimates 3,000 operating units worldwide and industry experiences approximately 3 major incidents per year, the true industry accident rate is a staggering 3 / 3,000 per unit per year (i.e., 10^-3). All the while our LOPA calculations are assuring us we have achieved an event rate of 10^-6. Something is not adding up! Rather than fussing over an unobtainable numbers game, wouldn’t it be wiser to address protection layers which are operating below requirements? We are (hopefully) performing audits and assessments on our protection layers and generating findings. Why are we not focusing our efforts on the results of these findings? We demand more bandages (protection layers) for amputated limbs (LOPA scenarios) instead of upgrading those bandages to tourniquets. Perhaps the dilemma is that we cannot effectively prioritize our corrective actions based on findings. Likely we have too much information and the real problems are lost in the chaos. What if there was a way to decipher the information overload and visualize the impact of our shortcomings? Enter Bayes’ rule, which provides a means to visualize findings through a protection layer health meter approach, prioritize action items, and staunch the bleeding.
Topics include: Bayes, Bayes rule, Bayes theory, LOPA, IPL, SIS, SIF, SIL Calculations, systematic failure, human factors, human reliability, operations, maintenance, IEC 61511, ANSI/ISA 61511, hardware reliability, proven in use, confidence interval, credible range, safety lifecycle, functional safety assessment, FSA stage 4, health meter.
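The back-of-the-envelope comparison in the abstract can be written out as follows. The unit count, incident count, and claimed LOPA rate are the abstract's own estimates; the variable names are illustrative only.

```python
import math

operating_units = 3000          # abstract's estimate of units worldwide
major_incidents_per_year = 3    # abstract's estimate of major incidents
lopa_claimed_rate = 1e-6        # event rate per year the LOPA results claim

# Observed rate per operating unit per year.
observed_rate = major_incidents_per_year / operating_units

# Gap between observed and claimed rates, in orders of magnitude.
gap_orders = math.log10(observed_rate / lopa_claimed_rate)

print(f"observed rate = {observed_rate:.0e} per unit-year")
print(f"gap           = {gap_orders:.1f} orders of magnitude")
```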
Whitepaper: Achieving 84-92% Urgent Alarm Reduction Through Comprehensive Lifecycle Implementation: A Dual-Unit Midstream Case Study
Awarded the Best Paper Award at the 2025 TEES Mary Kay O'Connor Process Safety Center-TAMU (MKO) Safety & Risk Conference
November 2025, by Greg Pajak, aeSolutions Senior Specialist, ICA
Abstract
A midstream facility implemented a systematic alarm rationalization program across two critical units, achieving unprecedented reductions in urgent alarm loads. Unit A reduced the urgent share of its alarms from 45% to 7% (an 84% relative reduction), while Unit B decreased from 62% to 5% (a 92% relative reduction). This paper presents the methodology, implementation approach, and quantified results of applying the ANSI/ISA-18.2-2016 alarm management lifecycle in a brownfield LNG facility. The comprehensive approach integrated automation, process safety, and operations perspectives, resulting in significant improvements in operator effectiveness and process safety performance. Cross-functional teams utilized the Maximum Severity Method for consistent, risk-based prioritization across 48,156 potential alarm points in Unit A and 7,009 points in Unit B. The project eliminated over 5,900 nuisance urgent alarms in Unit A and 1,960 in Unit B, transforming alarm systems from sources of operator overload into effective tools for abnormal situation management. Results demonstrate that properly implemented alarm management programs can achieve transformational improvements in operational safety and efficiency, providing a replicable model for the LNG industry.
1. Introduction
The liquefied natural gas (LNG) industry faces unique operational challenges due to cryogenic processes, flammable materials, and complex interdependencies between process units. Effective alarm management becomes critical for maintaining safe operations while preventing operator overload during abnormal situations. Despite widespread recognition of the importance of alarm management following major incidents like Texas City (2005) and Buncefield (2005), many facilities struggle to fully implement comprehensive alarm management lifecycles.
This facility recognized that partial alarm management efforts yield limited benefits and committed to systematic implementation of the complete ANSI/ISA-18.2-2016 lifecycle. As a brownfield site with existing legacy systems, the facility faced additional challenges requiring thorough re-evaluation of alarm configurations across multiple platforms including Honeywell Experion DCS, SCADA systems, and Safety Manager. This paper presents results from two major alarm rationalization projects: Unit A and Unit B. The scope encompassed all facility alarms interacting with normal process operations, excluding only fire and gas system alarms, which were addressed separately. The rationalization effort aimed to ensure each alarm met the fundamental definition: "An audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response."
2. Background and Literature Review
2.1 Alarm Management Standards Evolution
The process industries have developed comprehensive standards for alarm management, with ANSI/ISA-18.2-2016 and IEC 62682:2022 representing current best practices. These standards define a complete lifecycle approach encompassing ten stages: Philosophy, Identification, Rationalization, Detailed Design, Implementation, Operation, Maintenance, Monitoring and Assessment, Management of Change, and Audit. Research demonstrates that facilities implementing partial lifecycle elements achieve limited improvements, while comprehensive implementation yields transformational results. The Abnormal Situation Management (ASM) Consortium estimates that poor alarm management contributes to $20 billion annually in lost production and incidents across the process industries.
2.2 LNG Industry Specific Challenges
LNG facilities present unique alarm management challenges due to:
- Cryogenic temperature operations requiring precise control
- Vapor management systems with rapid dynamics
- Integration between liquefaction, storage, and regasification
- Stringent environmental compliance requirements
- Post-incident regulatory scrutiny
These factors necessitate alarm systems that support rapid, accurate operator response while minimizing cognitive load during upset conditions.
2.3 Quantifying Alarm Management Performance
Industry benchmarks established by the Engineering Equipment and Materials Users Association (EEMUA) Publication 191 define acceptable alarm system performance metrics:
- Average alarm rate: <1 alarm per 10 minutes
- Peak alarm rate: <10 alarms per 10 minutes
- Alarm priority distribution: ~80% Low, ~15% Medium, ~5% High
However, many facilities operate far outside these guidelines, with urgent/critical alarms often comprising 30-60% of total alarm load, creating conditions where operators cannot effectively respond to genuine process upsets.
3. Methodology
3.1 Project Scope and Timeline
The alarm rationalization encompassed two major operational units:
- Unit A: Conducted January 29 - March 26, 2024
- Unit B: Conducted March 11-15, 2024
Both projects utilized hybrid in-person and remote participation via Webex to accommodate team members across multiple locations.
3.2 Team Composition
Cross-functional teams included:
- Process Controls Engineering
- Process Engineering
- Operations personnel
- Operations Management
- Third-party facilitators (Applied Engineering Solutions) experienced in alarm rationalization methodology
This diverse composition ensured comprehensive evaluation incorporating technical design, operational experience, and process safety perspectives.
3.3 Rationalization Methodology
The team employed a knowledge-based Maximum Severity Method for alarm prioritization.
This approach evaluates each alarm against multiple consequence categories:

Table 1: Severity Level Matrix
Severity Level | Safety/Environmental                 | Economic Impact | Equipment Damage
Catastrophic   | Fatality/Major Environmental Release | >$10M           | Total Loss
Severe         | Lost Time Injury/Reportable Release  | $1M-$10M        | Major Damage
Moderate       | Medical Treatment/Minor Release      | $100K-$1M       | Significant Repair
Minor          | First Aid/No Release                 | <$100K          | Minor Repair

The highest severity across all categories determines final alarm priority, ensuring conservative risk assessment.
3.4 Documentation and Analysis Tools
The rationalization process utilized:
- Existing Honeywell Experion alarm database exports
- Current Piping and Instrumentation Diagrams (P&IDs)
- aeAlarm software (Sphera PHA-Pro® based) for systematic documentation
- Historical alarm activation data to validate setpoints
Each credible alarm was documented with:
- Purpose and process deviation addressed
- Consequence of no operator action
- Required operator response
- Time available for response
- Priority assignment rationale
3.5 Alarm Qualification Criteria
Alarms were evaluated against the site's Alarm Management criteria:
- Does the condition require operator action?
- Is the operator the primary respondent?
- Is there sufficient time for operator response?
- Will the operator know what action to take?
- Can the operator take the required action?
Points failing these criteria were reclassified as events, journals, or removed entirely.
4. Results and Discussion
4.1 Unit A Alarm Reduction Results
This rationalization achieved dramatic improvements in alarm system performance:

Table 2: Unit A Alarm Distribution - Before and After Rationalization
Priority | Pre-Rationalization | Post-Rationalization | Reduction
Urgent   | 6,473 (45%)         | 571 (7%)             | 91.2%
High     | 541 (4%)            | 405 (5%)             | 25.1%
Low      | 7,259 (51%)         | 6,674 (87%)          | 8.1%
Total    | 14,273 (100%)       | 7,650 (100%)         | 46.4%

The 91.2% reduction in urgent alarms represents elimination of 5,902 nuisance or improperly classified alarms that previously competed for operator attention during critical situations.
Figure 1: Unit A Alarm Priority Distribution Transformation
4.2 Unit B Results
Unit B demonstrated even more dramatic improvements:

Table 3: Unit B Alarm Distribution - Before and After Rationalization
Priority | Pre-Rationalization | Post-Rationalization | Reduction
Urgent   | 2,036 (62%)         | 76 (5%)              | 96.3%
High     | 377 (12%)           | 202 (14%)            | 46.4%
Low      | 853 (26%)           | 1,164 (81%)          | -36.5%*
Total    | 3,266 (100%)        | 1,442 (100%)         | 55.8%
*Low priority alarms increased as urgent alarms were properly reclassified

The 96.3% reduction in urgent alarms eliminated 1,960 improperly configured alarms, dramatically improving the signal-to-noise ratio for genuine process upsets.
Figure 2: Unit B Alarm Priority Distribution Transformation
4.3 Systematic Improvements Identified
The rationalization process identified 129 total action items across both units:
- Unit A: 58 action items
- Unit B: 71 action items
Common improvement categories included:
- Elimination of redundant alarms on single process deviations
- Proper configuration of alarm deadbands and delay timers
- Reclassification of informational points to events/journals
- Integration of alarm response procedures with operator training
- Correction of alarm priority inversions
4.4 Operational Impact Assessment
The rationalized alarm system has fundamentally transformed the operating environment at this facility. While specific quantitative metrics are proprietary, the qualitative improvements in operational performance have been significant. The dramatic reduction in alarm load, particularly in the urgent category, has created a calmer, more focused control room environment where operators can effectively manage the process rather than simply reacting to constant alarms.
Compliance and Documentation Benefits
- 100% of remaining alarms now have documented response procedures
- Full traceability established for regulatory audits
- Alarm system performance now aligns with EEMUA 191 guidelines
- Complete audit trail maintained through aeAlarm documentation
5. Implementation Lessons and Best Practices
5.1 Critical Success Factors
1. Executive Sponsorship and Resource Commitment: Full lifecycle implementation requires significant time investment from operations and engineering personnel. Executive support ensured adequate resource allocation and schedule priority.
2. Operator Engagement Throughout Process: Including experienced operators in every rationalization session captured critical institutional knowledge and ensured practical response procedures.
3. Systematic Methodology Application: Consistent application of the Maximum Severity Method prevented subjective priority assignment and ensured conservative risk assessment.
4. Integration with Existing PSM Systems: Linking alarm rationalization with Management of Change, PHA revalidation, and operator training programs embedded improvements in operational practice.
5.2 Common Challenges and Solutions
Challenge 1: Securing Adequate Time from Key Personnel
Solution: The primary challenge was obtaining large blocks of time from busy operational staff. The project succeeded by using flexible scheduling, breaking sessions into manageable durations, and emphasizing the long-term operational benefits of participation.
Challenge 2: Resistance to Removing "Historical" Alarms
Solution: Data-driven demonstration of alarm flooding impact during actual events convinced stakeholders to eliminate non-critical alarms. The involvement of extremely knowledgeable staff who understood both process and operations proved invaluable in making these decisions smoothly.
Challenge 3: Data Consistency Across Systems
Solution: Careful verification processes ensured alignment between disparate PLC systems and the master alarm database, preventing loss or duplication of critical alarm information.
5.3 Technology and Tool Considerations
The aeAlarm rationalization tool proved essential for:
- Maintaining consistency across multiple sessions
- Tracking action items and implementation status
- Generating operator response documentation
- Supporting regulatory audit requirements
Integration with existing Honeywell Experion systems required careful configuration management to preserve rationalization decisions during system updates.
6. Industry Applications and Recommendations
6.1 Scalability to Other LNG Facilities
The methodology demonstrated here scales effectively to other facilities by:
- Adapting severity matrices to site-specific risk tolerances
- Adjusting team composition based on organizational structure
- Phasing implementation based on unit criticality
- Leveraging common control system platforms
6.2 Recommended Implementation Approach
Based on our experience, optimal implementation follows this sequence:
Phase 1: Foundation (Months 1-2)
- Develop site-specific alarm philosophy
- Establish performance baselines
- Form cross-functional team
- Select rationalization tools
Phase 2: Pilot Implementation (Months 3-4)
- Select representative unit/system
- Complete full rationalization cycle
- Validate methodology and tools
- Refine procedures based on lessons learned
Phase 3: Full Deployment (Months 5-12)
- Systematically address remaining units
- Implement approved changes
- Train operators on new alarm schemes
- Establish monitoring systems
Phase 4: Sustainment (Ongoing)
- Monthly performance reviews
- Quarterly alarm health assessments
- Annual philosophy updates
- Continuous improvement initiatives
6.3 Return on Investment Considerations
While specific project costs are proprietary, the business case for alarm rationalization is compelling. The investment in this project is minor compared to the potential costs of:
- Operator hours spent managing nuisance alarms
- Extended troubleshooting time during process upsets
- Potential incidents resulting from operator overload
- Regulatory penalties for non-compliance with RAGAGEP
Industry benchmarks demonstrate typical returns including:
- Reduced operator errors through improved situational awareness
- Decreased unplanned downtime from better upset management
- Lower incident investigation costs
- Invaluable improvement in regulatory compliance position
7. Conclusions
This alarm rationalization project demonstrates that systematic implementation of the ANSI/ISA-18.2-2016 lifecycle can achieve transformational improvements in alarm system performance. The 84-92% reductions in urgent alarm loads across two major units significantly exceed typical industry achievements, validating the comprehensive approach. Key conclusions from this implementation:
- Full lifecycle implementation is essential: Partial efforts yield marginal benefits while comprehensive programs achieve step-change improvements.
- Cross-functional engagement drives success: Integration of operations, engineering, and process safety perspectives ensures practical, sustainable solutions.
- Quantified baselines enable continuous improvement: Detailed before/after metrics demonstrate value and guide ongoing optimization.
- Brownfield challenges are surmountable: Legacy systems can be successfully rationalized with proper methodology and commitment.
- Operator effectiveness improvements justify investment: Enhanced situational awareness and response capability directly improve process safety performance.
The dramatic reductions achieved here establish new benchmarks for alarm management excellence in the midstream industry. As facilities face increasing operational complexity and regulatory scrutiny, comprehensive alarm rationalization becomes not just best practice but operational necessity.
8. Future Work
Building on current achievements, future initiatives include:
Advanced Alarm Management Techniques
- Implementation of state-based alarming for startup/shutdown
- Dynamic alarm suppression during known process transitions
- Predictive analytics for alarm flood prevention
Integration with Digital Transformation
- Incorporation of machine learning for nuisance alarm identification
- Real-time alarm performance dashboards
- Mobile operator notification systems
Industry Collaboration
- Development of LNG-specific alarm management guidelines
- Benchmarking studies across multiple facilities
- Knowledge sharing through industry forums
Continuous Improvement Metrics
- Correlation of alarm performance with safety incidents
- Operator workload quantification studies
- Economic impact validation
The success achieved through systematic alarm rationalization provides a foundation for continued advancement in operational excellence and process safety performance.
References
- ANSI/ISA-18.2-2016, Management of Alarm Systems for the Process Industries, International Society of Automation, Research Triangle Park, NC.
- IEC 62682:2022, Management of alarm systems for the process industries, International Electrotechnical Commission, Geneva, Switzerland.
- EEMUA Publication 191, Alarm Systems - A Guide to Design, Management and Procurement, 3rd Edition, Engineering Equipment and Materials Users Association, London, UK, 2013.
- Rothenberg, D.H., "Alarm Management for Process Control: A Best-Practice Guide for Design, Implementation, and Use of Industrial Alarm Systems," Momentum Press, New York, 2018.
- Hollifield, B., and Habibi, E., "The Alarm Management Handbook: A Comprehensive Guide," PAS, Houston, TX, 2011.
- U.S. Chemical Safety and Hazard Investigation Board, "Investigation Report: Refinery Explosion and Fire," Report No. 2005-04-I-TX, Washington, DC, 2007.
- Health and Safety Executive, "The Buncefield Incident 11 December 2005: The final report of the Major Incident Investigation Board," Bootle, UK, 2008.
- Abnormal Situation Management Consortium, "Effective Alarm Management Practices," Honeywell Process Solutions, Phoenix, AZ, 2019.
- Center for Chemical Process Safety, "Guidelines for Safe Automation of Chemical Processes," 2nd Edition, AIChE, New York, 2017.
- Stauffer, T., and Sands, N.P., "Alarm Management and ISA-18.2: Management of Alarm Systems for the Process Industries," ISA Automation Week Proceedings, 2014.
Acknowledgments
The authors acknowledge the dedication of operations and engineering personnel who committed extensive time to the rationalization process. Special recognition goes to Applied Engineering Solutions for their expert facilitation and the operations teams who provided invaluable institutional knowledge. This project's success reflects the organization's commitment to operational excellence and process safety leadership.
Designing Operator Tasks to Minimize the Impact of Heuristics and Biases
When a person is blamed for “not thinking,” the reality is often that they were thinking but were not aware of it. This is explained by the theory of System 1 (fast) versus System 2 (slow) thinking, which holds that we are effectively two people: our consciously aware selves (System 2) and a dominant, fast subconscious (System 1) that makes most of our decisions without our being aware of it in the moment (to the point that some have argued there is no such thing as “free will”). The heuristics (i.e., mental shortcuts) we use in System 1 are necessary to make it through a day, because maintaining a continuous conscious stream of thought is exhausting, and they often lead to good outcomes. However, System 1 thinking can make us vulnerable to systematic biases (i.e., mental traps) that arise from the use of those heuristics. It is necessary to be aware of the traps System 1 thinking can create, because awareness is often our only defense against them. In this respect, “fast thinking” represents one of the fundamental limits to achieving safe operation. In addition to awareness, operator tasks and the interfaces operators use should, where possible, be designed to minimize the likelihood of systematic bias when thinking in System 1. Lastly, it would be useful to provide designs that increase the potential for the operator to engage System 2 (conscious) thinking when required, since it is less susceptible to biases. This paper combines a discussion of the cognitive psychology behind System 1 and System 2 thinking, the types of heuristics we use, and the biases that result with operator task and interface design that can minimize the likelihood of systematic bias. The paper incorporates the lessons learned from five years of safety-critical Task Analysis performed for field and control room tasks. 
A practical model of operator response to abnormal situations will be described that links the heuristics used to the biases that may occur, along with design features that minimize the likelihood of those biases. As presented at the 2020 AIChE Spring Meeting & 16th Global Congress on Process Safety.
How Taking Credit for Planned and Unplanned Shutdowns Can Help You Achieve Your SIL Targets
by Keith Brumbaugh, P.E., CFSE
Achieving Safety Integrity Level (SIL) targets can be difficult when proof test intervals approach turnaround intervals of five years or more. However, some process units have planned shutdowns and predictable unplanned shutdowns multiple times a year. During these shutdowns, it may be possible to document that the safety devices functioned properly. This documented evidence can be incorporated into SIL verification calculations to show that performance targets can be met without expensive fault tolerance, online testing schemes, and similar measures. This can result in considerable cost savings for an operating unit.
The problem
If a process plant is following the ANSI/ISA 84.00.01 process safety lifecycle (i.e., ISA 84) or similar, a SIL assessment (e.g., a Layer of Protection Analysis (LOPA)) would be undertaken during the allocation of safety functions to protection layers phase to assign SIL targets to each Safety Instrumented Function (SIF). During the design and engineering phase of the ISA 84 safety lifecycle, the team performing the SIL verification calculations may discover that the SIFs do not meet their performance targets. Assuming the calculations were done properly using valid data and assumptions, something would need to change in order to meet or exceed the required performance targets. This issue could occur in a greenfield plant when first designing a SIF, but it is more likely to be discovered during a revalidation cycle of a brownfield plant.
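To illustrate why documented shutdowns can matter in a SIL verification, the sketch below uses the widely used simplified approximation for a single (1oo1) device, PFDavg ≈ λ_DU × TI / 2, which holds only when λ_DU × TI is much less than 1. It is not the whitepaper's method. The failure rate, the shutdown interval, the fraction of dangerous failures a shutdown would reveal (`coverage`), and all function names are hypothetical values and names chosen for illustration.

```python
def pfd_avg(lambda_du: float, test_interval_years: float) -> float:
    """Simplified PFDavg for one device: lambda_DU * TI / 2 (valid when lambda_DU * TI << 1)."""
    return lambda_du * test_interval_years / 2.0


def pfd_avg_with_shutdown_credit(
    lambda_du: float,
    coverage: float,
    shutdown_interval_years: float,
    proof_test_interval_years: float,
) -> float:
    """Split dangerous undetected failures into two groups:
    those a documented shutdown would reveal (tested every shutdown interval) and
    those only the full proof test reveals (tested every proof test interval)."""
    revealed = coverage * lambda_du * shutdown_interval_years / 2.0
    hidden = (1.0 - coverage) * lambda_du * proof_test_interval_years / 2.0
    return revealed + hidden


def sil_band(pfd: float) -> int:
    """Map a low-demand PFDavg to its SIL band (0 means no SIL is achieved)."""
    if pfd >= 1e-1:
        return 0
    if pfd >= 1e-2:
        return 1
    if pfd >= 1e-3:
        return 2
    if pfd >= 1e-4:
        return 3
    return 4


# Hypothetical inputs: 0.005 dangerous undetected failures per year,
# a 5-year proof test interval, and documented shutdowns every 6 months
# that are assumed to reveal 90% of dangerous failures.
baseline = pfd_avg(0.005, 5.0)
credited = pfd_avg_with_shutdown_credit(0.005, 0.9, 0.5, 5.0)
print(f"Proof test only:      PFDavg={baseline:.5f}, SIL band {sil_band(baseline)}")
print(f"With shutdown credit: PFDavg={credited:.5f}, SIL band {sil_band(credited)}")
```

With these assumed inputs the credited case lands one SIL band higher than the proof-test-only case. The size of any real benefit depends on how completely each shutdown actually exercises the devices and on how well that evidence is documented.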
Can we achieve Safety Integrity Level 3 (SIL 3) without analyzing Human Factors?
by Keith Brumbaugh, P.E.
Many operating units have a common reliability factor that is overlooked or ignored during the design, engineering, and operation of high integrity Safety Instrumented Functions (SIFs): the Human Reliability Factor. In industry, evaluations of high integrity SIFs (such as SIL 3) focus too heavily on hardware reliability, calculated to the nth decimal place, to the detriment of the human factors that could also affect the Independent Protection Layer (IPL). Most major accident hazards arise from human failure, not failure of hardware. If improving the hardware reliability of IPLs to some threshold were all that was needed to prevent process safety incidents, the frequency of near misses and actual incidents should have tailed off long ago, but it has not. Evaluating the human impact on a Safety Instrumented Function requires performing a Human Factors Analysis. Human performance does not conform to standard methods of statistical uncertainty, but Human Reliability as a science has established quantitative limits of human performance. How do these limits affect what we can reasonably achieve with our high integrity SIFs? What uncertainty is introduced into our IPLs if we ignore these realities? This paper will examine how we can incorporate quantitative Human Factors into a SIL analysis. Representative operating units at various stages of maturity in human factors analysis and the IEC/ISA 61511 Safety Lifecycle will be examined. The authors will also share a checklist of the human factor considerations that should be taken into account when designing a SIF or writing a Functional Test Plan.
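As a rough illustration of why human reliability can cap what a high integrity SIF achieves, the sketch below combines a hardware PFD with a human error probability, treating them as independent causes of failure on demand. This is an assumed simplification, not the paper's method. The numbers and function names are hypothetical and are not taken from the whitepaper or from any human reliability data set.

```python
def combined_pfd(hardware_pfd: float, human_error_probability: float) -> float:
    """Probability the SIF fails on demand if either the hardware fails or a
    human error (e.g. a bypass left in place) has disabled it.
    Assumes the two causes are independent."""
    return 1.0 - (1.0 - hardware_pfd) * (1.0 - human_error_probability)


def max_human_error_probability(hardware_pfd: float, target_pfd: float) -> float:
    """Largest human error probability that still keeps the combined PFD at or
    below the target. Returns 0.0 if the hardware alone already misses the target."""
    if hardware_pfd >= target_pfd:
        return 0.0
    return 1.0 - (1.0 - target_pfd) / (1.0 - hardware_pfd)


# Hypothetical example: hardware calculated at PFD = 2e-4 (inside the SIL 3 band,
# 1e-4 <= PFD < 1e-3) and an assumed human error probability of 1e-3.
hardware = 2e-4
human = 1e-3
total = combined_pfd(hardware, human)
budget = max_human_error_probability(hardware, 1e-3)
print(f"Combined PFD: {total:.6f}")
print(f"Human error budget to stay below 1e-3: {budget:.6f}")
```

With these assumed inputs the combined figure exceeds the SIL 3 upper limit of 1e-3 even though the hardware alone sits well inside the band, which shows how further hardware refinement can be swamped by an unexamined human contribution.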