You invest in people, processes, and technology to prevent cybersecurity incidents, and plan what to do if they happen. But let’s face it: Incidents occur despite your best efforts. Vulnerabilities are exposed. Systems are hacked. Data is stolen. When your business is threatened, in a minor way or a major one, there’s an impact. There’s a tendency to want to just fix the problem and get it over with. Your professional instinct tells you to clean things up as fast as you can and get back to hardening your systems ASAP. No one wants to hit “pause” and dwell on a crisis while it is underway.
But the time right after an incident occurs is a critical moment. It’s the ideal time to learn, grow, and improve. If your organization doesn’t have a post-incident plan in place that asks WHY the incident happened, you’re not turning that challenge into a learning opportunity. The first 48 hours after an incident are the perfect time to run a Cybersecurity Incident Postmortem. Unlike a murder investigation, the root cause analysis (RCA) can take place a week later without harm, but only if the step-by-step actions taken were documented at the time rather than reconstructed from memory.
If postmortems are already part of your security procedures, read on to ensure you’re making the most of them by adhering to best practices. If the concept is new to you, read on for why you need them in your security routine and for tips on getting started.
A Welcome Crisis?
A crisis can be constructive and valuable – no matter how counterintuitive that seems. No one wants a crisis and prevention is crucial, but what you do right after an incident is just as important.
While large enterprises have entire teams devoted to organizational transformation that may take ownership of a postmortem, even small crises can leave small and mid-sized companies feeling strapped, lacking the necessary time, resources, and personnel to address them.
SMBs may have even more at stake. Severe incidents can have a major impact on revenue streams and business growth in the months and years to come. Postmortems can do more than help prevent security problems down the road. The IT function touches every part of your business. Digging into what went wrong uncovers gaps, suggests strategies, fosters teamwork and improves departmental coordination. It lets your IT team reflect on its work. As such, postmortems are valuable tools for continuous business improvement.
Root Cause Analysis
If you don’t have a post-incident plan in place, don’t worry. The technique was established in the 1970s in industrial manufacturing by world-leading companies like Toyota, Motorola, 3M, and GM. Those best practices have since been translated for today’s business context. Whatever the framework, the ultimate objective is the same: understanding the root cause of the incident. If “postmortem” is too macabre for you, just say RCA, for Root Cause Analysis, instead.
An RCA is a facilitated meeting, or series of meetings, to review what went wrong and come up with a strategy to address it. Is every incident worthy of analysis? It depends on the severity. We feel that any outage, breach, attack, breakdown, or data loss that takes time to clean or fix is one worth addressing. Why? Anything that forces you to take a step backward tells you there’s room for improvement.
The beauty of the RCA is that it prompts a comprehensive analysis, one that looks for root causes anywhere in an organization. A given issue could be caused by a missing patch or software vulnerability, but a person’s actions may also be the cause, revealing a gap that a software fix can address. To attain these insights, RCAs need to question everything that touches the security issue, from internal processes to partners and external factors.
Ground Rules for Success
It’s important not to rush headfirst into an RCA without setting down guidelines. Crises are difficult. Systems may have malfunctioned. Colleagues may have made errors in judgment. These sessions bring co-workers together who have different knowledge sets, varying levels of power and responsibility inside the company, and conflicting perspectives. These can be sensitive situations that need to be broached with forethought and care.
The tenets we follow are applicable in businesses of any scale:
1. Explore through talk and debate – not blame
An RCA is not a finger-pointing exercise. You need to establish psychological safety for the process to work. All participants need to feel supported to explore the incident fully and honestly, with no fear of reprimand.
2. Empower your teams with the knowledge to avoid future challenges
At its base, the RCA is a learning experience. The entire process boils down to a statement, a question, and the process of answering it. “This happened.” “Why?” The ensuing discussion is educational.
3. Cultivate cross-team learning and communication
Major incidents impact areas downstream from IT, and related functions have a role to play in the postmortem. This is an opportunity for teams to learn from each other, share perspectives, and foster dialogue.
4. Bring the focus back onto your customer
Even when incidents don’t directly impact customers, they reveal a great deal about how you provide services or products. The RCA process empowers teams to problem-solve together and take ownership of strategies and solutions that best serve your customer base.
The Process in Brief
People are forgetful. To find out “what happened,” the RCA should ideally take place the day after the incident. Distribute an incident template beforehand for everyone to fill out.
Designate a facilitator who’s a great communicator, ideally an unbiased party with no stake in the incident and no interest except for improving the company.
Kick off the meeting by introducing the guiding principles. Ask if everyone feels safe. Reiterate the reason everybody is there, and that there is complete “immunity” from consequences.
Establish the timeline of the incident as a group by comparing templates and gathering supporting materials such as emails, charts, and logs.
Diagnose what happened. Identify what went well, what went wrong, and places where you “got lucky.” Make sure everyone feels included and empowered to participate.
Employ the “Five Whys” technique (pioneered by Taiichi Ohno, of Toyota Motor Corporation) to get to the root cause, and reimagine the scenario backward to assess how the incident could have been prevented.
Explore the lessons learned and take a vote on the best ones. Put a project plan in place to address root causes, designate teams, assign a leader, and establish a timeline to get problems fixed.
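For teams that keep their templates in digital form, the timeline and “Five Whys” steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the TimelineEntry fields, the function names, and all of the sample data are invented for this example and are not part of any standard RCA tool or template.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimelineEntry:
    """One row of a participant's incident template (hypothetical fields)."""
    when: datetime
    reporter: str
    action: str


def merge_timelines(*templates):
    """Combine every participant's entries into one chronological list."""
    merged = [entry for template in templates for entry in template]
    return sorted(merged, key=lambda entry: entry.when)


def five_whys(problem, answers):
    """Pair each "Why?" with the group's answer.

    Returns the question/answer chain and the final answer, which is
    only a candidate root cause until the group agrees on it.
    """
    chain = []
    statement = problem
    for answer in answers:
        chain.append((f"Why? {statement}", answer))
        statement = answer  # the next "Why?" is asked about this answer
    return chain, statement


# Invented sample data, for illustration only.
helpdesk = [
    TimelineEntry(datetime(2024, 1, 1, 9, 5), "helpdesk", "First report of login failures"),
]
ops = [
    TimelineEntry(datetime(2024, 1, 1, 9, 30), "ops", "Restarted the web server"),
    TimelineEntry(datetime(2024, 1, 1, 8, 50), "ops", "Monitoring alert fired"),
]

timeline = merge_timelines(helpdesk, ops)
for entry in timeline:
    print(entry.when.isoformat(), entry.reporter, entry.action)

problem = "The file server was encrypted by ransomware"
answers = [
    "An employee opened a malicious attachment",
    "The email filter did not flag the attachment",
    "The filter rules had not been updated in months",
    "Updating the rules was not assigned to anyone",
    "There is no owner for email security maintenance",
]

chain, root_cause = five_whys(problem, answers)
for question, answer in chain:
    print(question, "->", answer)
print("Candidate root cause:", root_cause)
```

The code does not find the root cause for you. It simply records the group’s answers in order so the reasoning can be reviewed later without relying on memory.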
The Big Reveal
The value of cybersecurity incident postmortems often lies not only in identifying what went wrong but also in revealing what you didn’t even know about your processes and approach. Impactful realizations can emerge from fresh perspectives that are only possible after facing challenges.
Cybersecurity incidents will happen. While many of you have incident postmortem processes in place, it’s worth taking the time to assess your approach. Doing them right can yield so much more than just checking the boxes. Perhaps even more critically, working towards positive solutions begets more positive solutions and helps build a culture of retrospection across your company.
If you’ve never run a postmortem, it doesn’t have to be a huge undertaking. A single department can benefit from stepping back and taking the opportunity to learn and improve. The important thing is to start turning incidents into opportunities for continuous operational improvement.
To learn more about what to do before and during cybersecurity incidents, download our white paper: Foundations for Incident Response Readiness.