The Science of Detection Part 3: A Closer Look at the “Detectors” You Rely on When Hunting for Evidence
This is the third blog in my science of detection series. In the previous parts, we examined the key elements of a data source and considered integrated reasoning. Today, I’ll be taking a closer look at the signal quality we get from the various “detectors” that we use to find malicious activities in our environment.
Be sure to check back in the coming weeks to see the next blogs in this series. In part four, I’ll be talking about architectural approaches to detection, and looking at how we collect and aggregate information so that it’s useful to our security programs. I’ll be making some predictions about the progress we’ll see in this area in the future, because I think the old way of doing things has reached a dead end.
Security analysts have many different information sources—“detectors”—to consider when making decisions about whether or not they see malicious activity taking place in their environment. Each detector has a purpose, and each contributes some degree of differential value to the ultimate decision, but only a few of them were specifically designed for security applications. That complicates things.
What’s interesting about these information sources is that each must be interpreted and analyzed in a different way in order to assemble enough information to get a truly comprehensive picture of what’s taking place in the environment. They also operate at different levels of abstraction (for example, signatures are much more abstract than raw data), which means that a key task in analyzing any attack is assembling a corroborative summary using as many diverse information sources as possible.
Assembling such a summary involves multidimensional analysis. It’s tremendously important that we bring the latest advances in analytical reasoning and mathematical and scientific research to bear on our security programs and how we leverage information within them.
With this in mind, let’s talk about the information sources we use, explain their most common applications, and put them into context.
Network packets are all the communications that transit your network. Very often they’re encrypted. The highest-end security programs might include complete packet capture, but that gets very expensive quickly. Packet capture offers the highest fidelity but most dilute set of information for incident detection. A short-term packet capture solution (that holds data for 30-60 days) often ends up being of little use forensically because incidents are most often detected later in their lifecycle. The next-best-thing to complete packet capture is probably a combination of NetFlow and network security sensors.
Logs, at their most basic, are just records of system or user activity. Some of them are relevant for security detection purposes, but most are not. Historically speaking, logs were usually written to manage application and system problems, and they tend to be highly inconsistent in their content, their format, and their usefulness for security.
When a specific security control is violated, or an attempt to violate it is made, a log event is generated. There’s always some chance that the activity is malicious in nature. How big is this chance? Well, it’s different for every single log message and log source. This makes the aggregation and timeline of logs more important than any single log event when it comes to inferring or understanding malicious activity.
This is why we use rules. Rules help us interpret and contextualize logs, and thus slightly improve their utility for detection purposes.
The problem is: how many failed logins does it take before you know you have a hijacked account instead of a forgetful user? How different is the number of failed logins it would take to raise our suspicion on a Monday morning from what it’d take on a Wednesday afternoon? Sometimes we do see security avoidance behaviors in logs (for instance, clearing them), but user mistakes can and do explain these things most often, and it’s hard to know when to dig in.
Network flow data show the connection details and the amount of data transferred between hosts on your network (and out to the Internet). They’re like the network equivalent of monitoring who’s calling whose cell phone within a criminal syndicate. Network graph analysis and visualization are useful approaches to understanding NetFlow data.
Indicators (of malicious or suspicious activity)
Signatures of known attacks and other indicators of malicious code may be detected through sensors when monitoring network communications. These are short, hexadecimal character sequences known to be contained within attack payloads. In order to ensure a match when an attack occurs, even when written with a highly specific sequence of bytes in mind they often don’t account for all other possibilities of non-malicious occurrences of the same sequence in a data stream and thus they’re written loosely and thus produce a large number of false alerts. There are currently over 57,000 IDS signatures in existence: only a tiny subset of these are relevant at any given moment in time. This produces a high volume of false or nuanced alerts, further obscuring valuable detection signals. Signatures benefit from being analyzed by machines rather than humans because of the depth of analysis needed to separate out the relevant information. It’s also very important to consider where and how you place sensors because their value is directly related to their visibility.
Threat intelligence is another indicator. Yes, it also suffers from a volume problem, and its volume problem is almost as bad as that of network security sensors. Threat intelligence lists try not to omit potential malicious attacks and thus produce a high volume of alerts, which are hard for humans to analyze. Threat intelligence includes lists of IP addresses, domains and known bad file hashes. I consider known good file hashes to be valuable intelligence, too. Once again, combinations of threat indicators offer much higher fidelity as evidence of real threat activity.
Heuristics are behavioral indicators. For example, an alert might be generated when a piece of software takes an action that’s not normal for that software, such as spawning an additional process outside of user-approved space. Heuristics are a library of past incident observations, and as such, are completely historically focused. Although it’s valuable not to fall for the same thing twice, these tend to have a short lifespan when it comes to high accuracy.
First Order Processing
Rules follow a predictable structure (Activity — Threshold — Context — Action) to identify known suspicious activity. Known suspicious activities are described using Boolean logic or nested searches, a threshold is set, and if this is reached or crossed, a notification is sent to a monitoring channel for human evaluation.
At the most atomic level, there are fewer than 130 rules in regular use. In fact, in most organizations fewer than 45 are implemented. Rules are most valuable when they’re used to enforce logic that’s specific to your company’s unique business challenges, such as possible fraud scenarios.
Context—additional information about the entities being investigated and the relationship between them—can help you answer questions about the potential impact of attacks in progress and your vulnerability to them. It’s a key component in initial processing.
Statistics and metrics are important in guiding your operations: self-reflection and dispassionate measurement are critical to the effective application of detection science. You can measure attributes like coverage and performance, or calculate cost- or time-per-detection by data source and use this information to guide you in deploying your sensor architecture. Statistical analysis can be a powerful tool for uncovering attackers’ latest stealth techniques. Any activity that’s too close to the center of a normal bell curve might be hiding something in the noise—says the ever-suspicious security investigator.
Second Order Processing
Behaviors, patterns, and baselines are very commonly used to measure and score users’ stealthy or suspicious behaviors. The goal is to identify the users who either pose an insider threat or whose machines have been compromised by malicious code. Maintaining a library of first-order information that you’ve collected over time and conducting periodic calculations against it can help you pinpoint things that might be suspicious. “Repeat offender” is a catchphrase for a reason.
Nth Order Processing
Anomalies, clusters, affinity groups, and network graphs can reveal some very nuanced attacks. Running advanced algorithms across large amounts of data can yield interesting results.
A common fallacy is that anomalies are more likely to be malicious. That’s simply not true. The way our networks are interconnected today makes for all sorts of anomalies in all layers of the technology stack. These provide investigators the same sort of analytical puzzle as network security signatures do.
Some of these algorithms have well-understood security applications. One example is clustering: when you cluster IDS data, what you find most often are false positives, because they occur in highly predictable ways. When a particular signature generates alerts for what’s actually regular business traffic, the same alert will be triggered every time that business process takes place. It thus produces a very obvious cluster that you can exclude when looking for malicious activity.
The more information known to be unimportant that we can remove, the more clearly we can see what else is going on. This is where analytical detection comes into its own. Very often, we run algorithms on security data simply to see if a subject matter expert can interpret the outcome. Possessing both domain expertise and knowledge of data science is critical if you want to understand what advanced algorithms are telling you.
Visualization and hunting are an nth order processing task. Using tools that allow you to pivot and display related datasets is the ultimate form of security threat hunting, and it’s also the most fun. You can derive some detection value from considering any layer of detectors through the lens of a visual tool.
Do you think I’m about to tell you there’s another layer called “artificial intelligence”? If so, you’re wrong.
The next layer is simply making a decision: has something malicious occurred? The more information we have to feed into the decision-making process, the more effective and deeper the decision will be. All of the information sources listed above have something of value to contribute.
But you have to ask yourself: how many of these factors can analysts consider in real time as they watch events streaming across a console?
If you’d like to make it possible for your security operations team to incorporate input from a greater variety of detectors and information sources into their decision-making processes and workflows, consider adding the Respond Analyst to your team. Built to integrate with a broad array of today’s most popular sensors, platforms and solutions, the Respond Analyst brings specialized threat intelligence and detailed local contextual information to bear on every decision it makes about which events to escalate. Quite simply, it’ll give your ability to interpret and analyze detection data a boost—and allow your analysts to consider a far wider variety of sources.
To learn more about how the Respond Analyst can help your business become more thorough and derive greater insight from the detectors in your environment, contact us to schedule a demo today.