Stop Tilting at Windmills: The Failure of SIEM Correlation Rules and Queries
When Don Quixote and his trusty squire embark upon their visionary quest, they’re eager to slay their enemies and defeat all forms of injustice. Nothing can hold Don Quixote back from launching himself into his first battle: the famous sword fight against the windmills of La Mancha. Seeing them rising across a distant plain, Don Quixote believes them to be a band of hulking, ferocious giants, and he’s quick to draw his sword.
In cybersecurity operations, we too are often keen to defeat our adversaries, and dedicated to battling threats. We return to our consoles each day, ready to resume the fight against cybercrime. But are the actions we’re taking truly effective? Or have we, perhaps, taken up arms against a false foe?
Let’s take a closer look at two activities that cybersecurity analysts spend a great deal of time on: writing correlation rules and querying event log data. Both are efforts to create or find meaningful relationships in what might seem like an unending, limitless sea of data. Both are unquestionably valuable activities when performed properly. And both have the potential to distract your analyst team from a concerted focus on their true enemy.
Brandishing the Sword of Correlation Rules May Leave You Exposed
Correlation rules enable Security Information Management (SIM) or Security Information and Event Management (SIEM) tools to define the sets of conditions under which they’ll issue an alert. Out of the box, most SIM/SIEM systems come with 500 to 1,000 pre-defined, standardized correlation rules. Of course, these rules weren’t developed to take your organization’s individual needs, typical users’ behaviors, or particular risk profile into account. And naturally enough, they draw on little to no contextual information when making each alert decision.
Correlation rules offer teams the opportunity to better identify patterns in their environments’ telemetry data, and to highlight events that have a high probability of being malicious. The problem is that correlation rules are inherently prone to producing false positives.
Let’s say that your team has set a correlation rule to trigger an alert when there are excessive failed login attempts on any user account. We might define “excessive” as five, for example, only to discover that one particular user frequently forgets her password, taking up far more of the security team’s time than this particular alert type should. So your team will need to adjust the rules to ensure that they’re a good fit for the regular user behavior—and error—patterns that exist within your organization.
But adjust the number of login attempts considered “excessive” too far upwards, and you’ve made your organization vulnerable to brute-force-style password-guessing attacks. Most correlation rules are bounded by a time frame as well. A common pattern is alerting on more than five attempts within a five- or ten-minute period. Sophisticated adversaries know these patterns or are actively trying to learn them, though. They might start out with four failed login attempts within ten minutes, then introduce a delay (during which the correlation rule resets) and follow that with another sequence of attempts.
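To make the evasion concrete, here is a minimal Python sketch of a time-bounded failed-login rule. It is an illustration under stated assumptions, not any SIEM product's actual rule syntax: the `FailedLoginRule` class, the account names, and the timestamps are all hypothetical, and the threshold (more than five failures) and window (ten minutes) simply mirror the example above.

```python
from collections import defaultdict, deque


class FailedLoginRule:
    """Hypothetical rule: fire when an account has more than `threshold`
    failed logins inside a sliding window of `window_seconds`."""

    def __init__(self, threshold=5, window_seconds=600):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # account -> recent failure times

    def observe(self, account, timestamp):
        """Record one failed login (timestamp in seconds); return True if the rule fires."""
        recent = self._failures[account]
        recent.append(timestamp)
        # Forget failures that have aged out of the window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        return len(recent) > self.threshold


def count_alerts(rule, account, timestamps):
    """Feed a sequence of failure timestamps to the rule and count how often it fires."""
    return sum(rule.observe(account, t) for t in timestamps)


# A noisy burst: 8 failures, 10 seconds apart.
burst = [i * 10 for i in range(8)]

# "Low and slow": 4 failures in under a minute, then a pause longer than the
# window, repeated 5 times. That is 20 guesses in total.
low_and_slow = [batch * 700 + i * 15 for batch in range(5) for i in range(4)]

burst_alerts = count_alerts(FailedLoginRule(), "alice", burst)
slow_alerts = count_alerts(FailedLoginRule(), "bob", low_and_slow)
print("burst alerts:", burst_alerts)
print("low-and-slow alerts:", slow_alerts)
```

In this sketch the burst trips the rule, while the paced sequence makes 20 guesses without ever having more than four failures inside one window, so the rule never fires. Widening the window or lowering the threshold would catch it, at the cost of more alerts on users who simply forget their passwords.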
Because correlation rules are narrow and fixed, and adversaries’ tactics expansive and ever-changing, your team will need to invest a great deal of time into rewriting and adjusting the rules if they are to be truly effective. If correlation rules are set to minimize the false positive rate, they could well be leaving your environment overly exposed.
Unlike correlation rules, which are created to alter systems’ alerting behavior in the future, queries are retrospective. They’re searches or analyses conducted across large volumes of data: system and user activity logs, threat feeds, intrusion detection system (IDS) or intrusion detection and prevention system (IDPS) logs, and logs from firewalls and other applications and sensors. When cybersecurity analysis includes queries, security teams are seeking additional information about events that occurred in their network’s recent past.
Say, for example, your team just received new threat intelligence that an IP address is malicious. You would conduct a query to see whether or not a host within your environment had established a connection to that IP address within the past few days. As with correlation rules, your team must have a clear idea of what you’re looking for. Like correlation rules, queries return a binary result (a ‘yes’/‘no’ or ‘true’/‘false’ answer).
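The IP lookup described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`time`, `src_host`, `dst_ip`), the host names, and the `hosts_that_contacted` helper are assumptions made for the example, the addresses come from ranges reserved for documentation, and a real SIEM would express the same search in its own query language.

```python
from datetime import datetime, timedelta


def hosts_that_contacted(connections, bad_ip, since):
    """Return the sorted internal hosts that connected to `bad_ip` at or after `since`."""
    return sorted(
        {c["src_host"] for c in connections
         if c["dst_ip"] == bad_ip and c["time"] >= since}
    )


# Hypothetical connection log records.
connections = [
    {"time": datetime(2024, 1, 9, 8, 30), "src_host": "ws-014", "dst_ip": "203.0.113.7"},
    {"time": datetime(2024, 1, 2, 8, 30), "src_host": "ws-022", "dst_ip": "203.0.113.7"},
    {"time": datetime(2024, 1, 10, 9, 0), "src_host": "ws-031", "dst_ip": "198.51.100.4"},
]

now = datetime(2024, 1, 10, 12, 0)
matches = hosts_that_contacted(connections, "203.0.113.7", now - timedelta(days=3))
contacted = bool(matches)  # the binary answer the query ultimately gives
print(contacted, matches)
```

Note what the query does not see: `ws-022` reached the same address eight days earlier, outside the three-day lookback, so it is silently excluded. The answer is only as good as the question the analyst thought to ask.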
The Shadowy Shades of Grey Where Adversaries Hide
Both correlation rules and queries are attempts to discern malicious behavior within your environment, but both are limited by the fact that they’re one-size-fits-all approaches. Adversaries’ goal is to hide within the “almost normal” zone: their actions may trigger the sorts of alerts that, if cursorily examined by your team, seem like they’re not all that different from typical user behaviors. Their actions will often be perilously close to things your correlation rules won’t alert on.
Members of your security team will probably end up spending a great deal of time writing and rewriting correlation rules. High false-positive rates are a fact of life, and a challenge they will face at some point. And in this situation, frustration is an understandable and natural response. Content teams will want to bring down the false positive rates, since doing so allows the cyber incident response team to better allocate their limited time and attention. But if your security team believes that false positives are the enemy, you’ve taken up arms against the wrong adversary.
Cyber incident response teams are on the opposite side of the same problem: when event monitoring teams deliver them an unending stream of false positives, it’s tempting to get sloppy, tired, or irritated. And it can be easy to approach new alerts with the misguided presupposition that they’re likely to be false positives just because the last few hundred alerts you investigated were. Unfortunately, the real attacks will almost inevitably arrive in the midst of a stream of false positives. As long as overall false-positive rates are as high as they are today, this will always be the case.
Shield Your Team with Security Operations Software
Correlation rules and queries have a valuable role to play within multi-layered security approaches. When properly written, correlation rules can complement other tools to help security teams identify the low-hanging fruit in their environments—rules can point out misconfigured systems, alert in the presence of known malware signatures, or identify patterns of traffic to known malicious IPs, for instance.
But correlation rules can’t and won’t identify the most sophisticated attacks. A better approach couples them with advanced security operations software that relies on probability theory-based algorithms to consider every alert within a broad and deep context. By instead using combinatorial correlation underpinned with Bayesian mathematics, security operations software can learn to look inside the “grey areas” and tease out the likelihood of attacks concealed in near-normal actions.
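As a rough illustration of what weighing an alert in context can mean, the sketch below applies Bayes' rule in odds form to combine several pieces of evidence. It is not a description of how any particular product works: the prior, the evidence labels, and the likelihood ratios are invented for the example, and multiplying the ratios together assumes the pieces of evidence are conditionally independent, which real telemetry often is not.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with Bayes' rule in odds form.

    Each likelihood ratio is P(evidence | malicious) / P(evidence | benign).
    Multiplying them assumes the evidence items are conditionally independent.
    """
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


# Hypothetical numbers: 1 in 100 alerts of this type turns out to be real.
prior = 0.01

# Hypothetical evidence and how much more likely each is under an attack.
evidence = {
    "target server is vulnerable to the exploit": 8.0,
    "signature has rarely been wrong in this environment": 5.0,
    "source IP appeared in an earlier incident": 6.0,
}

p_one = posterior_probability(prior, [8.0])
p_all = posterior_probability(prior, evidence.values())
print(f"one clue: {p_one:.3f}, all clues: {p_all:.3f}")
```

With these made-up numbers, a single clue lifts the estimate from 1% to roughly 7%, which a fixed rule would likely either ignore or alert on every time. Three clues taken together push it to roughly 71%. The point is the graded answer: instead of a yes or no, each alert gets a probability that shifts as context accumulates.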
Security operations software like the Respond Analyst uses combinatorial correlation to ask and answer questions such as: Is this server vulnerable to attack? Is this a signature I should worry about? In what ways does this action appear to be malicious, and in what ways doesn’t it? How is this related to other attacks or previous incidents? These are the kinds of questions an experienced human security analyst will ask.
But software can pose and respond to these questions consistently, reliably, and predictably, without taking coffee breaks or vacations, without leaving between-shift gaps in coverage, and without becoming frustrated or fatigued. It's as though Don Quixote has finally defeated the windmills, with unending access to tireless rationality and logical thought.