How many times have you struggled with the question: what does success look like? At ActZero, our data-driven approach to cybersecurity invites us to grapple daily with measuring and evaluating the work we do on behalf of our customers.
Like many, we first turned toward the standard metrics used in cybersecurity, built around a “Mean Time to X” (MTTX) formula, where X indicates a specific milestone in the attack lifecycle. These milestones include factors like Detect, Alert, Respond, Recover, or even Remediate.
As we started to operationalize our unique machine-learning driven approach, however, we realized that “speed” measures weren’t telling us the whole story. More profoundly, measuring only speed is less applicable in an industry where machine-driven alerts and responses happen in fractions of a second.
Rather than relying solely on the old MTTX formula, we borrowed a long-standing idea from another time-sensitive industry: video streaming. Your favorite video streaming platform (Netflix, YouTube, Amazon) cares about two core principles: speed and signal quality. Simply put: your video should arrive reliably within a certain time (Speed), and it should look great when it does (Quality). Let’s face it: who cares if the video stream carrying your team’s game shows up on your screen fast if you can’t see them score the goal?
The concept applies to cybersecurity alerts as well: we want to know that alerts are arriving reliably within a certain time (Speed), and that those alerts aren’t wrong (Quality). In cybersecurity, it doesn’t matter how quickly you alert on a detection that is “wrong” ... or worse, you get buried by “wrong” detections!
So we borrowed a simple, yet powerful, measure from our video streaming colleagues: Signal-to-Noise Ratio (SNR). SNR is the ratio of the amount of desired information received (“signal”) to the amount of undesired information received (“noise”). Success is then high signal with minimal noise, while maintaining specific TTX targets (note the lack of “mean” here … more on that later!)
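To make the ratio concrete, here is a minimal Python sketch of the calculation. It assumes you can count quality detections (signal) and bogus alerts (noise) over some reporting period; the function name and the counts are hypothetical and for illustration only, not a description of how ActZero computes it internally.

```python
def signal_to_noise(quality_detections: int, bogus_alerts: int) -> float:
    """Ratio of desired alerts (signal) to undesired alerts (noise)."""
    if quality_detections < 0 or bogus_alerts < 0:
        raise ValueError("counts must be non-negative")
    if bogus_alerts == 0:
        # No noise at all: the ratio is unbounded if there was any signal.
        return float("inf") if quality_detections > 0 else 0.0
    return quality_detections / bogus_alerts


# Hypothetical counts for one reporting period.
snr = signal_to_noise(quality_detections=45, bogus_alerts=15)
print(snr)  # 3.0, i.e. three quality detections for every bogus alert
```

A higher value means analysts spend more of their time on alerts that matter.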
So, let’s walk through some of the shortcomings of mean time metrics to understand how considering SNR as well will serve your SOC better. By understanding SNR for cybersecurity, you’ll be better equipped to assess security providers in a market of blossoming AI-driven solutions, and you’ll have a better indication of what makes for a quality detection (rather than a quick but useless one).
Outliers influence mean times
… that’s how “means” work. Means are averages, and therefore can smooth volatile data values and hide important trends. When we quote an average TTX, it is tempting to read it as “50% of the time we are better than this and 50% of the time we are worse,” but that is only true when the data is symmetric; a handful of very slow incidents can pull the mean far away from typical performance. Therefore, when we discuss these times internally, we always use “total percentage n” for more accuracy (there’s an explanation here) to state what percentage of the time a given figure actually applies. When we say a TTX of 5 seconds at TP99, we’re really saying that 99 out of 100 times, we hit a TTX of 5 seconds or better. This total percentage could help you understand how likely it is that your incident will be an actual “outlier” and cost you days of remediation and potential downtime!
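The gap between a mean and a TP99 figure is easy to see with a small Python sketch. The data below is invented for illustration, and the percentile helper uses the simple nearest-rank method, which is an assumption here; other tools may interpolate and give slightly different values.

```python
import math
from statistics import mean, median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical TTX samples in seconds: 98 fast responses and 2 slow outliers.
ttx_seconds = [2.0] * 98 + [600.0] * 2

print(mean(ttx_seconds))            # about 13.96: matches no real incident
print(median(ttx_seconds))          # 2.0: the typical case
print(percentile(ttx_seconds, 95))  # 2.0: 95 of 100 incidents were this fast
print(percentile(ttx_seconds, 99))  # 600.0: the outliers show up at TP99
```

In this made-up data set the mean describes neither the typical two-second response nor the ten-minute outliers, while the percentile figures state plainly how often each occurs.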
Mean times are a legacy metric
As a measurement standard, mean times are a legacy paradigm brought over from call centers. Over time, cybersecurity leaders adopted similar metrics because IT departments were familiar with them.
In reality, mean times don’t map directly to the kind of work we do in cybersecurity, and we can’t generalize them into meaningful indicators across the attack lifecycle. While these averages might convey speed relative to specific parts of the attack lifecycle, they don’t provide any actionable information other than (potentially) telling you to “hurry up!” In the best case, MTTX becomes a vanity metric that looks great on an executive dashboard but provides little actual business intelligence.
Signal-to-noise ratio measures quality detections
The fastest MTTX is not worth anything if it measures the creation of a bogus alert! We want our mean time metrics to tell us about actual alerts, not to be skewed by bad data.
So, how does an untuned MTTX tell you about the quality of work your security provider does, or how safe it makes your systems? You’re right! It doesn’t!
If you really want to understand the efficacy of your security provider, you need to understand both the breadth of coverage and the quality of detections. The speed vs. quality challenge is why we think in terms of SNR rather than mean times. For providers or those running a SOC, it’s the signal of quality detections relative to the plethora of benign alerts and other noise that determines your SNR, and you can use it to drive operational efficiency (see my colleague’s post, “Scale Your Security Operation by Focusing on SOCe”).
Look at how many quality detections your cybersecurity provider raises relative to the number of bogus alerts to understand the real measure of how successful they are at keeping your systems safe.
How ActZero can help
There are better measures than MTTX to evaluate cybersecurity efficacy. We recommend thinking in terms of signal-to-noise to better measure the quality and breadth of detections made by your security provider. New metrics like signal-to-noise will be crucial as cybersecurity solutions are empowered through AI and machine learning to react at machine speed.
To explore our thinking on this more deeply, check out our white paper in collaboration with Tech Target, “Contextualizing Mean Time Metrics to Improve Evaluation of Cybersecurity Vendors”.