With AI as over-hyped as it is, it’s tough to blame people for thinking they could get similar results from rules or basic statistics (the folks you should really blame are the marketing people!). That being said, in this blog post I’m going to explain why these simple heuristics pale in comparison to what a proper, data science-driven machine learning algorithm can do.
What do we mean by a Statistical / Analysis approach?
Many cybersecurity detections begin as simple queries or rules that originate from either known incidents or expert knowledge. These simple rules or heuristics can be great! They are usually informed by evidence and often tailored to a specific company’s environment. These detections can range from string matches for known indicators of attack (IOAs) to statistical approaches that analyze whether a given event or set of events falls into a typical distribution for the endpoint or environment.
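To make the two ends of that range concrete, here is a minimal Python sketch of a string-match rule and a basic statistical check. The indicator list, the daily counts, and the three-standard-deviation threshold are all made up for illustration; they are assumptions, not values from any real product or rule set.

```python
import statistics

# Hypothetical indicator list; real IOAs would come from threat intelligence.
KNOWN_BAD_STRINGS = ["bad.exe", "evil-domain.example"]

def matches_ioa(command_line: str) -> bool:
    """Static rule: flag a command line containing any known-bad string."""
    lowered = command_line.lower()
    return any(indicator in lowered for indicator in KNOWN_BAD_STRINGS)

def is_statistical_outlier(value: float, history: list, threshold: float = 3.0) -> bool:
    """Statistical rule: flag a value more than `threshold` standard
    deviations away from the historical mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

# Made-up daily counts of process launches on one endpoint.
daily_launches = [12, 15, 11, 14, 13, 12, 16]

print(matches_ioa("powershell -w 1 download(bad.exe)"))  # True
print(is_statistical_outlier(14, daily_launches))        # False
print(is_statistical_outlier(90, daily_launches))        # True
```

Both checks are easy to read and easy to tune, which is a big part of their appeal.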
Think statistics/analysis is sufficient? Think again
Despite their advantages, these basic analytical approaches have significant drawbacks. Static rules and heuristics risk becoming stale. When a query is written against published indicators of attack (IOAs), for example, an attacker need only change the file name or hash slightly to evade detection. Distributions of events and commands may drift over time, with the number of alerts slowly creeping up as a result. And many IT admins can likely relate to the ever-increasing length of the allowlists that some detections need in order to function as a company grows and diversifies.
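The hash-evasion point is easy to demonstrate. The sketch below assumes a hypothetical blocklist of SHA-256 hashes and a stand-in byte string in place of a real file. Appending a single byte is enough to produce a different hash and slip past an exact-match rule.

```python
import hashlib

# Stand-in for the contents of a known-malicious file.
payload = b"pretend this is a malicious binary"

# Hypothetical blocklist built from a published indicator of attack.
blocklist = {hashlib.sha256(payload).hexdigest()}

def is_blocked(file_bytes: bytes) -> bool:
    """Static rule: exact match of the file's SHA-256 against the blocklist."""
    return hashlib.sha256(file_bytes).hexdigest() in blocklist

# The attacker appends one byte; the hash no longer matches.
modified_payload = payload + b"\x00"

print(is_blocked(payload))           # True
print(is_blocked(modified_payload))  # False
```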
Even user and entity behavior analytics (UEBA) is necessarily limited by the size of the entity pool from which the analysis is drawn. Behaviors that are new to a given user or environment may be perfectly benign, but get flagged due to their novelty alone. (While some UEBA products are based on data science methods like anomaly detection, many are built on basic statistics, and the latter are susceptible to false positives each time a user acquires new software or learns a new skill.)
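As a rough illustration of the novelty problem, the sketch below compares a per-user "never seen before" rule with the same rule applied across a small pool of peers. The users, program names, and history are invented for the example, and the logic is deliberately simplistic. It is not how any particular UEBA product works.

```python
# Made-up history: which programs each user has run before.
history = {
    "alice": {"outlook.exe", "excel.exe"},
    "bob": {"outlook.exe", "python.exe"},
    "carol": {"excel.exe", "python.exe", "git.exe"},
}

def novel_for_user(user: str, program: str) -> bool:
    """Per-user novelty rule: flag anything this user has never run."""
    return program not in history.get(user, set())

def novel_for_pool(program: str) -> bool:
    """Pooled novelty rule: flag only what nobody in the pool has run."""
    return all(program not in programs for programs in history.values())

print(novel_for_user("alice", "python.exe"))  # True: new to alice, so it alerts
print(novel_for_pool("python.exe"))           # False: already common among her peers
print(novel_for_pool("unknown_tool.exe"))     # True: new to everyone
```

The larger the pool behind the baseline, the less likely a benign first-time behavior is to alert on novelty alone.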
A better approach: Anomaly Detection
Anomaly detection can complement a statistical or rules-based approach to mitigate certain drawbacks. In anomaly detection, the machine learning (ML) algorithm looks for outliers - in other words, anything that looks “weird.” Humans (and cybersecurity professionals especially) are natural anomaly detectors. Think of a time you’ve sifted through a bunch of data looking for that “needle in a haystack,” without thinking about the exact parameters of what you are looking for. But we can’t expect humans to sift through the quantity of alerts generated in modern environments. Thankfully, anomaly detection algorithms work in much the same way, learning what is normal for an environment without ever being given strict boundaries.
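For a feel of what "learning normal without strict boundaries" can look like, here is a small sketch of one common unsupervised technique: scoring an event by its average distance to its nearest neighbors in a baseline. This is a stand-in chosen for brevity, not a description of any production model, and the two features and all the numbers are invented.

```python
import math
import statistics

def fit_scaler(baseline):
    """Per-feature mean and standard deviation, so features share a scale."""
    columns = list(zip(*baseline))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stdevs

def scale(point, means, stdevs):
    """Standardize one event using the baseline's statistics."""
    return [(x - m) / s for x, m, s in zip(point, means, stdevs)]

def knn_anomaly_score(point, baseline, k=3):
    """Mean distance from `point` to its k nearest baseline events.
    Higher means further from anything seen before."""
    means, stdevs = fit_scaler(baseline)
    scaled_point = scale(point, means, stdevs)
    distances = sorted(
        math.dist(scaled_point, scale(other, means, stdevs)) for other in baseline
    )
    return sum(distances[:k]) / k

# Made-up baseline events: (command length, fraction of non-alphanumeric characters).
baseline = [
    (40, 0.10), (42, 0.12), (38, 0.11), (45, 0.09),
    (41, 0.13), (39, 0.10), (44, 0.12), (43, 0.11),
]

typical = (41, 0.11)
weird = (95, 0.45)

print(knn_anomaly_score(typical, baseline))  # small score
print(knn_anomaly_score(weird, baseline))    # much larger score
```

Nobody told the scorer where "too long" or "too many odd characters" begins. The baseline data alone defines what counts as far from normal.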
Our anomaly detection algorithms use characteristics similar to what a cybersecurity expert would look at, or even to what a statistical approach might use, but with a complexity that would be challenging to write into a statistical heuristic and an ability to process far more events than a human. In addition, our anomaly detection models can go beyond what is normal vs. weird in a specific user environment to analyze trends across businesses similar to yours - we’ll come back to an example of why that’s useful in a minute.
Let’s look at some concrete examples. In the following scenarios, we’ll look at detections involving PowerShell commands. (For more information on PowerShell and other scripting that can be used maliciously, check out our Threat Insight.) In these examples, I will contrast the use of traditional rule- or heuristic-based approaches to detecting specific malicious PowerShell techniques with an anomaly detection approach.
- Example 1: Avoiding a False Negative from Obfuscation
Malicious actors often try to trick rule-based detections by throwing in obfuscating characters - for example, a command line that begins “-w 1 dow`nlo`ad(bad.exe)” might trick a simple string match looking for the term “download.” A common countermeasure would be to preprocess the command line, removing unusual characters or punctuation and thereby increasing the likelihood of correctly matching specific words against the detection rules. The problem is that this approach risks throwing the baby out with the bathwater. In this case, the command is anomalous precisely because it has those extraneous characters in the middle of a word. Additionally, many script obfuscators will use elements like environment variables that are difficult to process safely. An anomaly detection model can pick up on the strange way the word is split, the presence of unusual characters, and the presence of the word “download” simultaneously, greatly increasing the probability of a detection.
- Example 2: Avoiding False Positives from Benign Processes
On the flip side, because many legitimate PowerShell scripts are functionally so similar to malicious PowerShell, they can often cause false positives on heuristic-based detection systems. Take this PowerShell command, common on machines running Visual Studio software:
“-NoProfile -InputFormat None -ExecutionPolicy Bypass -Command [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12; iex ((New-Object System.Net.WebClient).DownloadString('https://awebsite/install.ps1')); software1 upgrade -y python visualstudio2019-workload-vctools; Read-Host 'Type ENTER to exit'”.
It has multiple parameters that are often seen in attacks (ExecutionPolicy Bypass, WebClient, Download), and so could run afoul of rules-based queries that pick up on combinations of these indicators. If your company uses this software often, this file might be allowlisted. But if it’s an uncommon package in your environment or the name of the installation file recently changed, a heuristic detection might lead to false positives. An anomaly detection model trained across your full environment and many others like yours is much more likely to have seen this before and correctly classify it as normal behavior.
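Both examples can be reproduced with toy rules. In the sketch below, the term list, the two-hit threshold, and the feature names are hypothetical, and the benign command is abbreviated from the one quoted above. The naive match misses the obfuscated command, the combination rule flags the benign installer, and a few simple features capture what makes the obfuscated command odd without throwing that information away.

```python
import re

# Hypothetical indicator terms; real rule sets are larger and environment-specific.
SUSPICIOUS_TERMS = ["executionpolicy bypass", "webclient", "download"]

OBFUSCATED = "-w 1 dow`nlo`ad(bad.exe)"
# Abbreviated from the installer command quoted above.
BENIGN_INSTALL = (
    "-NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "
    "iex ((New-Object System.Net.WebClient)"
    ".DownloadString('https://awebsite/install.ps1'))"
)

def naive_rule(command_line: str) -> bool:
    """Flag a command line that contains the literal term 'download'."""
    return "download" in command_line.lower()

def strip_obfuscation(command_line: str) -> str:
    """Countermeasure: remove backticks before matching."""
    return command_line.replace("`", "")

def combination_rule(command_line: str, min_hits: int = 2) -> bool:
    """Flag a command line containing at least `min_hits` suspicious terms."""
    lowered = command_line.lower()
    hits = sum(term in lowered for term in SUSPICIOUS_TERMS)
    return hits >= min_hits

def obfuscation_features(command_line: str) -> dict:
    """Keep the 'weird' signals as features instead of stripping them away."""
    return {
        "backticks": command_line.count("`"),
        "word_split_by_backtick": bool(re.search(r"\w`\w", command_line)),
        "mentions_download": naive_rule(strip_obfuscation(command_line)),
    }

print(naive_rule(OBFUSCATED))                     # False: false negative
print(naive_rule(strip_obfuscation(OBFUSCATED)))  # True, but the oddity is now lost
print(combination_rule(BENIGN_INSTALL))           # True: false positive
print(obfuscation_features(OBFUSCATED))
# {'backticks': 2, 'word_split_by_backtick': True, 'mentions_download': True}
```

The features decide nothing on their own. They are the kind of input a model could weigh jointly, alongside how common a given command is across many environments.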
While it is theoretically possible to correctly classify both of these scenarios using heuristics and statistics, achieving the desired mix of accuracy and precision would require an ever-more-complex set of rules - one that reacted to each false positive or false negative. Layering rules like that generally results in low numbers of false positives, but also leads to missing real threats. In contrast, anomaly detection is both flexible and powerful, combining knowledge of your unique environment with other data to detect attacks while surfacing far fewer false positives.
For more information on how we’re applying machine learning to cybersecurity, check out this podcast featuring our Head of Data Science, Alexis Yelton. Or, to see our ML-driven detections in action, request a demo of our Managed Detection and Response service.