As IT leaders, we invest in people, processes, and technology to prevent cybersecurity incidents and to plan for potential threats and attacks. But despite our best efforts, we still encounter situations that lead to exposed vulnerabilities, hacked systems, and stolen data. Any threat, regardless of size, has an impact. When faced with a security threat, our professional instinct is to fix the problem: clean things up as fast as possible and quickly return to what is most important, which is reinforcing our systems. In the chaos of a security crisis, hitting the pause button is the farthest thing from our minds.
However, the 48 hours immediately following an incident are the right time to run a cybersecurity incident postmortem and a prime opportunity for active, critical learning. A postmortem lets you turn the challenge to your organization’s advantage by establishing a post-incident plan that asks the all-important ‘why’ questions.
If you already have postmortems embedded into your security procedures, great job! This blog will help you build on what you have and offer best practices for getting the most out of them. If this concept is new to you, you’ll learn about the value it adds to your security routine and get tips on how to get started.
There’s even more to discover in our free eBook: Foundations for Incident Response Readiness. Download for quick access to practical ways to help your organization develop an effective incident response plan.
A Welcome Crisis?
Large enterprises have the luxury of expert teams dedicated to organizational reform that own the postmortem process. However, any crisis, big or small, can debilitate small and mid-sized businesses (SMBs), which have limited physical and personnel resources to address it. For this reason, SMBs have more at stake, as a severe incident can have a long-term impact on revenue streams and business growth.
This may seem counterintuitive, but a crisis can be constructive. A postmortem in the aftermath of an incident does more than brace your business for impact. It uncovers systemic gaps, fosters teamwork, improves departmental coordination, and provides actionable insights for strategic improvements. In other words, building reflection into the postmortem process reveals opportunities for continuous business improvement.
Root Cause Analysis
A Root Cause Analysis (RCA) is a problem-solving technique used to detect the primary source of a problem. It is often credited to Sakichi Toyoda, the founder of Toyota Industries, and was made popular by the manufacturing industry. If “postmortem” sounds too macabre, substitute “RCA”; their function is the same.
An RCA is a facilitated meeting, or series of meetings, to review what went wrong and devise a strategy to address it. Every business is unique and at liberty to decide, based on severity, which incidents warrant an analysis. However, any outage, breach, attack, breakdown, or data loss that takes a long time to clean up or fix is worth analyzing.
The benefit of an RCA is that it prompts a comprehensive analysis. A missing patch or software vulnerability could cause a specific issue, but a person’s action or inaction could also be a contributing factor. The role of an RCA is to question everything related to the security issue, from internal processes to partners and external factors.
One component of the RCA is the post-incident report, which details the full scope of the attack and provides a cross-functional understanding of the core issues at each stage, from start to finish.
Ground Rules for a Successful Incident Response Postmortem
It is important to set guidelines when facilitating an RCA. Crises are difficult to manage, and dismissing factors such as system malfunctions, power dynamics, conflicting perspectives, and diverse knowledge and skill sets could cause an escalation. These can be sensitive situations that need to be broached with forethought and care.
At ActZero, we follow tenets of incident response postmortems that are applicable to businesses of any scale:
- Explore through talk and debate, not blame
An RCA is not a finger-pointing exercise. For a productive process, establish psychological safety standards. All participants need to feel supported to explore the incident fully and honestly, with no fear of reprimand.
- Emphasize the importance of knowledge transfer
At its core, the RCA is a learning tool. You can break the entire process into three parts: a statement, a question, and an answer. This happened; why, and how can we prevent a recurrence?
- Cultivate cross-functional learning and communication
Major incidents affect areas downstream from IT, so related functions need to play a role in the postmortem. This is an opportunity for teams to learn from each other, share perspectives, and foster dialogue.
- Be customer-oriented
Even when cybersecurity incidents do not have a direct impact on customers, how you provide services and deliver products remains integral. The RCA process empowers teams to problem-solve together and take ownership of strategies and solutions that best serve your customer base.
The Incident Postmortem Process in Brief
People are forgetful. To find out “what happened,” the RCA should ideally take place within 24 hours of the incident. Distribute an incident template beforehand for everyone to fill out.
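As a rough illustration, the incident template could be a small structured record, so every participant answers the same questions before the meeting. The Python sketch below is one possible shape. The class name and every field in it (`reporter`, `first_noticed`, and so on) are assumptions made up for this example, not a standard or an ActZero format.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class IncidentTemplate:
    """One participant's account of the incident (all fields are hypothetical)."""
    reporter: str                # who is filling out the template
    role: str                    # their function, e.g. "IT" or "Support"
    first_noticed: datetime      # when they first became aware of the problem
    what_happened: str           # their account, in their own words
    actions_taken: list[str] = field(default_factory=list)
    supporting_material: list[str] = field(default_factory=list)  # emails, charts, logs

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields that were left blank."""
        required = {
            "reporter": self.reporter,
            "role": self.role,
            "what_happened": self.what_happened,
        }
        return [name for name, value in required.items() if not value.strip()]


# Example: a draft that still needs the participant's account (sample data).
draft = IncidentTemplate(
    reporter="A. Analyst",
    role="IT",
    first_noticed=datetime(2024, 1, 15, 9, 30),
    what_happened="",
)
print(draft.missing_fields())  # ['what_happened']
```

A check like `missing_fields` lets the facilitator chase incomplete templates before the meeting rather than during it.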
Designate a facilitator who is an effective communicator, ideally an unbiased party with no stake in the incident but with a vested interest in improving the company. Kick off the meeting by introducing the guiding principles. Reiterate the reason for the process and that participants have complete ‘immunity’ from consequences. Ask for feedback.
Then proceed to the grunt work. Establish the timeline of the incident as a group by comparing templates and gathering supporting materials such as emails, charts, and logs. Diagnose what happened. Identify what went well, what went wrong, and places where you “got lucky.” Make sure everyone feels included and empowered to take part. Employ the “Five Whys” technique, a simple tool for solution-focused problem analysis, to get to the root cause, then work backward through the scenario to assess how the incident could have been prevented. Explore the lessons learned and vote for the best ones. Put a project plan in place to address root causes: designate teams, assign a leader, and establish a timeline to resolve the identified problems. Again, ask for feedback.
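The timeline and “Five Whys” steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, the `build_timeline` and `five_whys` helpers, and the sample incident are all hypothetical, invented here to show the shape of the exercise rather than a prescribed tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """A single observation taken from one participant's template (hypothetical)."""
    when: datetime
    source: str  # who reported it
    note: str    # what they saw or did


def build_timeline(*accounts: list[Event]) -> list[Event]:
    """Merge every participant's events into one chronological list."""
    merged = [event for account in accounts for event in account]
    return sorted(merged, key=lambda event: event.when)


def five_whys(problem: str, answers: list[str]) -> str:
    """Walk a chain of 'why?' answers and return the last one as the candidate root cause.

    Five is a guideline, not a rule: the chain ends wherever the group's answers end.
    """
    if not answers:
        raise ValueError("at least one answer is needed")
    current = problem
    for depth, answer in enumerate(answers, start=1):
        print(f"Why #{depth}: why did '{current}' happen? -> {answer}")
        current = answer
    return current


# Made-up sample accounts from two teams.
it_team = [Event(datetime(2024, 1, 15, 9, 45), "IT", "Isolated the file server")]
support = [Event(datetime(2024, 1, 15, 9, 10), "Support", "Users reported locked files")]

for event in build_timeline(it_team, support):
    print(event.when.strftime("%H:%M"), event.source, "-", event.note)

root_cause = five_whys(
    "Files on the server were encrypted",
    [
        "Malware ran on a workstation",
        "A known vulnerability was unpatched",
        "No one owned the patching schedule",
    ],
)
print("Candidate root cause:", root_cause)
```

Sorting the combined events by timestamp gives the group one shared sequence to debate, and the last answer in the “why” chain is only a candidate root cause until the group agrees on it.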
The Benefits of Incident Response Postmortems
The value of cybersecurity incident postmortems is extensive. They not only identify what went wrong but also reveal what you didn’t even know about your processes and approach. Impactful realizations can emerge from fresh perspectives that are only possible after facing challenges.
Cybersecurity incidents will happen. While many of you have incident postmortem processes in place, it’s worth taking the time to assess your approach. Doing them right can yield so much more than just checking the boxes. Perhaps even more critically, working towards positive solutions begets more positive solutions and helps build a culture of retrospection across your company.
If you’ve never run a postmortem, it doesn’t have to be a huge undertaking. A single department can benefit from observing, learning and improving. The important thing is to turn incidents into opportunities for continuous operational improvement.
All that to say, spin security failures into positives, and use incidents as an opportunity to gain new insights, reset, and grow. Your business deserves to be protected and will benefit greatly from an incident response plan. What you do before, during, and after a cybersecurity incident matters. Learn how to improve your organization’s preparedness for when threats come knocking. Download your complimentary copy of our eBook: Foundations for Incident Response Readiness for instant access.