A Look at Root Cause Analysis
Kenneth St Brice, the Asset Integrity Manager at Rashpetco, provides some insight into root cause analysis.
Please give some background about your company. What is your role, and what does it entail?
BG Group is an integrated energy company with a particular focus on natural gas. BG Group is involved across the gas chain, from exploration and production through transmission and distribution to liquefaction and power. BG Group currently operates in 25 countries.
Rashpetco is a joint venture between the Egyptian Natural Gas Holding Company (EGAS), BG and other joint venture partners. Rashpetco’s operations are located in the Mediterranean waters off the Egyptian north coast. There are two main concessions under development, namely the Rosetta and West Delta Deep Marine Concessions. The facility infrastructure comprises offshore production via platforms and subsea wells, which produce into a collection network and onshore gas terminal facilities.
My current role is Asset Integrity Manager within Rashpetco. The role primarily involves developing and implementing an Asset Integrity Management (AIM) program within Rashpetco. Key aspects of the job entail:
- Development of a Management Framework for Asset Integrity
- Implementation of a Performance Management system for Asset Integrity
- Identifying opportunities for continuous improvement in the facility AIM program
- Providing integrity assurance input to brownfield development projects.
Describe root cause analysis (RCA).
Root Cause Analysis (RCA) refers to the process by which the key underlying cause of an unwanted event (accident, incident) is identified. The process of arriving at the root cause is structured and assesses several candidate root causes while driving towards the final solution. Multiple tools and methodologies exist for root cause identification. Some have specialized applications, and it is important to know the particular strengths of each tool.
It is important to recognize that the Root Cause is not the same as the Immediate Cause of a failure, and often the two do not coincide. The Immediate Cause, though, always contributes to establishing the Root Cause.
How can RCA prevent recurring failures?
If the lessons arising out of the Root Cause Analysis are taken "on board," they have the potential to prevent recurrence. This is indeed the main reason for implementing a Root Cause Analysis. It is also typical that the identified root cause is common to several other potential failures. Thus, addressing the recommended actions from the RCA may help to eliminate not only recurrence of the specific failure but potentially other failures that share a common root with the one that occurred.
It is worth noting that, in practice, it is at times impossible to identify a single root cause of an event with 100 percent certainty. In such circumstances, it is prudent to work with the most likely causes and mitigate the risk of recurrence by addressing all of the potential causes of the event.
What are failure causes, and why is it important to analyze them?
A failure cause is the specific technical deficiency that is responsible for the failure of equipment (or a component). The failure cause may lie in the design, the operation or the process itself.
As with RCA, identifying the failure cause is critical to avoiding future or similar problems, and it also helps in determining a root cause. While the failure cause may be the immediate cause of an incident, the root cause may at times lie in deeper, more systemic issues from which the immediate or failure cause arose.
An example of this would be a failure cause that is the design of a specific equipment component, leading to mechanical failure of that component. The root cause of the incident may lie in the quality management process that led to the erroneous design being accepted. Resolving this deeper issue may safeguard against not only a repeat of the specific component deficiency but potentially deficiencies in the design of other components that may be coming out of a flawed process.
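The relationship between a failure cause and a root cause in the example above can be sketched in a few lines of Python. This is only an illustrative sketch, not an RCA tool described in the interview: the `ARISES_FROM` mapping, its wording and the `trace_to_root` helper are all hypothetical, and a real investigation would usually have several candidate causes at each level rather than a single chain.

```python
# Hypothetical cause chain (an assumption for illustration): each cause maps
# to the deeper cause it arose from. The entries paraphrase the design example.
ARISES_FROM = {
    "mechanical failure of component": "deficient component design",
    "deficient component design": "quality process accepted erroneous design",
}


def trace_to_root(immediate_cause, arises_from):
    """Follow the chain from an immediate cause down to the deepest known cause."""
    chain = [immediate_cause]
    seen = {immediate_cause}
    while chain[-1] in arises_from:
        deeper = arises_from[chain[-1]]
        if deeper in seen:
            # Guard against a circular mapping, which would loop forever.
            raise ValueError(f"circular cause chain at: {deeper}")
        chain.append(deeper)
        seen.add(deeper)
    return chain


chain = trace_to_root("mechanical failure of component", ARISES_FROM)
root_cause = chain[-1]  # the last entry is the deepest cause recorded
```

The first entry in `chain` is the immediate cause and the last is the systemic issue; corrective actions aimed at the last entry are the ones that may also protect other components.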
Please provide some best practices to implement an Asset Integrity Program.
Asset Integrity can be seen as fundamentally concerned with safeguarding an operating facility against the threats which may lead to a Major Hazard Event. The parts of the plant whose failure may result in such an event or inhibit the ability to cope with such an event are identified as the Safety Critical Elements. As such, any sound Asset Integrity Program is primarily concerned with ensuring robust management of Safety Critical Elements.
In order to do this, several key structures need to be in place. Among these are:
- Accurate identification of Safety Critical Elements. This is not always a simple process; however, if properly done, it improves focus on the aspects of the plant that are necessary for safeguarding the integrity of the installation.
- Clear ownership needs to be in place for Safety Critical systems.
- A Performance Management program is key to understanding the effectiveness of the Asset Integrity Management program. This would involve a combination of elements, including Key Performance Indicators and verification of the performance of Safety Critical Elements, and in general a process that allows deficiencies to be identified and followed up to closure.
Several underlying structures will support these objectives and the success of the program, such as:
- Strong management support
- Robust planning function
- Clear communication process
- Management processes for maintenance of quality
What are some ways to improve safety conditions?
Improvement of safety conditions is based upon organizational "buy in" to a set of safety values. For this to be realized, there are several stages to work through:
- The first phase is awareness building. This involves the organization making clear, via a range of appropriate media, what its safety expectations and values are.
- The next stage of the process is reinforcement of these values as the culture is developed. This should comprise programs that positively recognize improvements while at the same time identifying gaps in the progress towards the targets.
- The ongoing process would comprise a performance management process that seeks to record progress against set benchmarks and determine the corrective actions required to improve performance.
How can we improve inspection and monitoring to protect oil and gas and petrochemical assets?
The key to applying a sound inspection and monitoring program is understanding the degradation mechanism of the component being monitored and inspected. It is often the case that inspection and monitoring programs are based upon historical practice rather than a specific review of the particular circumstances and application of the most appropriate monitoring methods.
This is not always easy to apply, as at times the degradation mechanisms in parts of the process may be unknown. In this case, the best approach would be to draw on the history derived from other operators of similar processes. Where this is unavailable, the alternative is to take a conservative approach to inspection and monitoring by using multiple technologies until an appropriate history of the process has been obtained and the most appropriate approach can be adopted. Knowledge of the best available technologies, their benefits and their limitations is necessary to support this approach.
Application of risk-based methods as a whole allows for the optimal use of resources and more efficient capture of relevant information on the equipment degradation.
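As a rough illustration of how a risk-based method directs resources, the sketch below ranks equipment by a simple risk score. Everything in it is an assumption made for illustration: the `Item` class, the equipment tags, the 1 to 5 scoring scales and the choice of likelihood multiplied by consequence as the score are hypothetical and are not taken from the interview or from any particular standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical equipment record; scores use an assumed 1-5 scale."""
    tag: str
    likelihood: int   # assumed likelihood of degradation-related failure
    consequence: int  # assumed severity of the resulting event

    @property
    def risk(self) -> int:
        # Simple risk score: likelihood multiplied by consequence.
        return self.likelihood * self.consequence


def rank_by_risk(items):
    """Return items ordered from highest to lowest risk score."""
    return sorted(items, key=lambda item: item.risk, reverse=True)


# Made-up example data, not real equipment or real scores.
items = [
    Item("pipeline-A", likelihood=2, consequence=5),
    Item("vessel-B", likelihood=4, consequence=4),
    Item("valve-C", likelihood=3, consequence=1),
]
ranked = rank_by_risk(items)
ranked_tags = [item.tag for item in ranked]
```

With these made-up scores, `vessel-B` is ranked first for inspection and `valve-C` last. In practice the scores would come from knowledge of the degradation mechanisms, which is why that understanding has to come first.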
The entire process should be periodically reviewed to confirm that it is still appropriate or to identify where further improvements can be made.
Is there anything you would like to add?
While Asset Integrity implementation is based upon a number of technical objectives being accomplished, it is critical to understand that the cornerstone of success in any Asset Integrity Management program is the human element. The human aspect of integrity assurance, also known as the "soft barriers," comprises areas such as competency, training, supervision, procedure implementation and communication. If not well developed, these factors may undermine the assurance built in other areas of integrity development.
As a consequence, significant focus has to be placed on understanding the challenges of human behavior and working to ensure that human reliability is part of the success equation for the program's development.
Interview by Jessica Livingston