The Science of Detection Part 3: A Closer Look at the “Detectors” You Rely on When Hunting for Evidence

This is the third blog in my science of detection series. In the previous parts, we examined the key elements of a data source and considered integrated reasoning. Today, I’ll be taking a closer look at the signal quality we get from the various “detectors” that we use to find malicious activities in our environment.

Be sure to check back in the coming weeks to see the next blogs in this series. In part four, I’ll be talking about architectural approaches to detection, and looking at how we collect and aggregate information so that it’s useful to our security programs. I’ll be making some predictions about the progress we’ll see in this area in the future, because I think the old way of doing things has reached a dead end.

Security analysts have many different information sources—“detectors”—to consider when making decisions about whether or not they see malicious activity taking place in their environment. Each detector has a purpose, and each contributes some degree of differential value to the ultimate decision, but only a few of them were specifically designed for security applications. That complicates things.

What’s interesting about these information sources is that each must be interpreted and analyzed in a different way in order to assemble enough information to get a truly comprehensive picture of what’s taking place in the environment. They also operate at different levels of abstraction (for example, signatures are much more abstract than raw data), which means that a key task in analyzing any attack is assembling a corroborative summary using as many diverse information sources as possible.

Assembling such a summary involves multidimensional analysis. It’s tremendously important that we bring the latest advances in analytical reasoning and mathematical and scientific research to bear on our security programs and how we leverage information within them.

With this in mind, let’s talk about the information sources we use, explain their most common applications, and put them into context.

Raw Data

Network packets are all the communications that transit your network. Very often they’re encrypted. The highest-end security programs might include complete packet capture, but that gets very expensive quickly. Packet capture offers the highest-fidelity but most dilute set of information for incident detection. A short-term packet capture solution (one that holds data for 30-60 days) often ends up being of little use forensically because incidents are most often detected later in their lifecycle. The next best thing to complete packet capture is probably a combination of NetFlow and network security sensors.

Logs, at their most basic, are just records of system or user activity. Some of them are relevant for security detection purposes, but most are not. Historically speaking, logs were usually written to manage application and system problems, and they tend to be highly inconsistent in their content, their format, and their usefulness for security.

When a specific security control is violated, or an attempt to violate it is made, a log event is generated. There’s always some chance that the activity is malicious in nature. How big is this chance? Well, it’s different for every single log message and log source. This makes the aggregation and timeline of logs more important than any single log event when it comes to inferring or understanding malicious activity.

This is why we use rules. Rules help us interpret and contextualize logs, and thus slightly improve their utility for detection purposes.

The problem is: how many failed logins does it take before you know you have a hijacked account instead of a forgetful user? How different is the number of failed logins it would take to raise our suspicion on a Monday morning from what it’d take on a Wednesday afternoon? Sometimes we do see security avoidance behaviors in logs (for instance, clearing them), but user mistakes can and do explain these things most often, and it’s hard to know when to dig in.
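One way to make the Monday-versus-Wednesday question concrete is to keep a separate baseline for each time slot. Here’s a minimal sketch in Python, assuming you already have historical counts of failed logins per weekday and hour. The function names, the sample history, and the cutoff of three standard deviations are all my own illustrative assumptions, not a recommended setting.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_baseline(history):
    """history: iterable of (weekday, hour, failed_login_count) observations."""
    buckets = defaultdict(list)
    for weekday, hour, count in history:
        buckets[(weekday, hour)].append(count)
    # Keep the mean and population standard deviation for each time slot.
    return {slot: (mean(c), pstdev(c)) for slot, c in buckets.items()}

def is_suspicious(baseline, weekday, hour, count, k=3.0):
    """Flag a count more than k standard deviations above that slot's mean."""
    if (weekday, hour) not in baseline:
        return False  # no history for this slot, so no basis for a verdict
    mu, sigma = baseline[(weekday, hour)]
    # Floor sigma at 1.0 so a very steady slot does not alert on tiny changes.
    return count > mu + k * max(sigma, 1.0)

# Hypothetical history: Monday 09:00 is noisy, Wednesday 15:00 is quiet.
history = [(0, 9, 40), (0, 9, 50), (0, 9, 45), (2, 15, 4), (2, 15, 6), (2, 15, 5)]
baseline = build_baseline(history)
monday = is_suspicious(baseline, 0, 9, 30)      # 30 failures on Monday morning
wednesday = is_suspicious(baseline, 2, 15, 30)  # 30 failures on Wednesday afternoon
```

With this made-up history, the same 30 failed logins pass unnoticed on Monday morning and stand out on Wednesday afternoon, which is exactly the ambiguity a single fixed threshold can’t capture.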

Meta-Data

Network flow data show the connection details and the amount of data transferred between hosts on your network (and out to the Internet). They’re like the network equivalent of monitoring who’s calling whose cell phone within a criminal syndicate. Network graph analysis and visualization are useful approaches to understanding NetFlow data.
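To give a feel for what graph analysis of flow data involves, here’s a small Python sketch that turns flow records into a weighted graph of who talks to whom. The record layout (source, destination, bytes) and the addresses are assumptions made for illustration; real NetFlow exports carry many more fields.

```python
from collections import defaultdict

# Hypothetical flow records: (source host, destination host, bytes transferred).
flows = [
    ("10.0.0.5", "10.0.0.9", 1200),
    ("10.0.0.5", "10.0.0.9", 800),
    ("10.0.0.5", "203.0.113.7", 50000),
    ("10.0.0.8", "10.0.0.9", 300),
]

def build_graph(flows):
    """Return {src: {dst: total_bytes}}, a weighted directed graph of conversations."""
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst, nbytes in flows:
        graph[src][dst] += nbytes
    return graph

def fan_out(graph):
    """Number of distinct peers each source host contacted."""
    return {src: len(peers) for src, peers in graph.items()}

graph = build_graph(flows)
peers_per_host = fan_out(graph)
```

From here, measures like fan-out or total bytes per edge give you something to sort, chart, or feed into a visualization tool.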

Indicators (of malicious or suspicious activity)

Signatures of known attacks and other indicators of malicious code may be detected by sensors monitoring network communications. These are short hexadecimal character sequences known to be contained within attack payloads. To ensure a match when an attack occurs, signatures are often written loosely. Even one written with a highly specific sequence of bytes in mind often doesn’t account for non-malicious occurrences of the same sequence in a data stream. Either way, the result is a large number of false alerts. There are currently over 57,000 IDS signatures in existence, but only a tiny subset of these is relevant at any given moment. This produces a high volume of false or nuanced alerts, further obscuring valuable detection signals. Signatures benefit from being analyzed by machines rather than humans because of the depth of analysis needed to separate out the relevant information. It’s also very important to consider where and how you place sensors because their value is directly related to their visibility.

Threat intelligence is another indicator. Yes, it also suffers from a volume problem, and its volume problem is almost as bad as that of network security sensors. Threat intelligence lists try not to omit potential malicious attacks and thus produce a high volume of alerts, which are hard for humans to analyze. Threat intelligence includes lists of IP addresses, domains and known bad file hashes. I consider known good file hashes to be valuable intelligence, too. Once again, combinations of threat indicators offer much higher fidelity as evidence of real threat activity.

Heuristics are behavioral indicators. For example, an alert might be generated when a piece of software takes an action that’s not normal for that software, such as spawning an additional process outside of user-approved space. Heuristics are a library of past incident observations, and as such, are completely historically focused. Although it’s valuable not to fall for the same thing twice, these tend to have a short lifespan when it comes to high accuracy.

First Order Processing

Rules follow a predictable structure (Activity — Threshold — Context — Action) to identify known suspicious activity. Known suspicious activities are described using Boolean logic or nested searches, a threshold is set, and if this is reached or crossed, a notification is sent to a monitoring channel for human evaluation.
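The Activity, Threshold, Context, Action structure maps naturally onto a small data structure. The Python sketch below is one hypothetical way to express it; the `Rule` class, the event fields, and the notification string are invented for illustration and don’t correspond to any particular product’s rule language.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    activity: Callable[[dict], bool]   # Boolean test that selects matching events
    threshold: int                     # number of matches needed to fire
    context: Callable[[dict], bool]    # extra condition on the entity involved
    action: Callable[[str, int], str]  # what to do when the rule fires

def evaluate(rule, events):
    """Return the action's result if the threshold is reached, else None."""
    hits = [e for e in events if rule.activity(e) and rule.context(e)]
    if len(hits) >= rule.threshold:
        return rule.action(rule.name, len(hits))
    return None

rule = Rule(
    name="repeated-failed-logins",
    activity=lambda e: e["type"] == "login" and not e["success"],
    threshold=3,
    context=lambda e: e["account_is_privileged"],
    action=lambda name, n: f"notify: {name} matched {n} events",
)

# Hypothetical events: three failed logins against a privileged account.
events = [{"type": "login", "success": False, "account_is_privileged": True}] * 3
result = evaluate(rule, events)
```

Keeping the four parts separate makes it easier to tune the threshold or swap the context test without rewriting the whole rule.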

At the most atomic level, there are fewer than 130 rules in regular use. In fact, in most organizations fewer than 45 are implemented. Rules are most valuable when they’re used to enforce logic that’s specific to your company’s unique business challenges, such as possible fraud scenarios.

Context—additional information about the entities being investigated and the relationship between them—can help you answer questions about the potential impact of attacks in progress and your vulnerability to them. It’s a key component in initial processing.

Statistics and metrics are important in guiding your operations: self-reflection and dispassionate measurement are critical to the effective application of detection science. You can measure attributes like coverage and performance, or calculate cost- or time-per-detection by data source and use this information to guide you in deploying your sensor architecture. Statistical analysis can be a powerful tool for uncovering attackers’ latest stealth techniques. Any activity that’s too close to the center of a normal bell curve might be hiding something in the noise—says the ever-suspicious security investigator.
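As a rough illustration of cost-per-detection by data source, here’s a tiny Python sketch. The source names and figures are entirely hypothetical, and what counts as a “confirmed detection” is something each program would need to define for itself.

```python
def cost_per_detection(sources):
    """sources: {name: (annual_cost, confirmed_detections)} -> {name: cost or None}."""
    return {
        name: (cost / detections if detections else None)  # None: nothing detected yet
        for name, (cost, detections) in sources.items()
    }

# Hypothetical figures, for illustration only.
sources = {
    "network_ids": (120_000, 40),
    "endpoint": (90_000, 60),
    "full_packet_capture": (300_000, 0),
}
costs = cost_per_detection(sources)
```

A source with no confirmed detections isn’t necessarily worthless (it may be what you reach for during an investigation), so treat a number like this as one input among several.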

Second Order Processing

Behaviors, patterns, and baselines are very commonly used to measure and score users’ stealthy or suspicious behaviors. The goal is to identify the users who either pose an insider threat or whose machines have been compromised by malicious code. Maintaining a library of first-order information that you’ve collected over time and conducting periodic calculations against it can help you pinpoint things that might be suspicious. “Repeat offender” is a catchphrase for a reason.

Nth Order Processing

Anomalies, clusters, affinity groups, and network graphs can reveal some very nuanced attacks. Running advanced algorithms across large amounts of data can yield interesting results.

A common fallacy is that anomalies are more likely to be malicious. That’s simply not true. The way our networks are interconnected today makes for all sorts of anomalies in all layers of the technology stack. These provide investigators the same sort of analytical puzzle as network security signatures do.

Some of these algorithms have well-understood security applications. One example is clustering: when you cluster IDS data, what you find most often are false positives, because they occur in highly predictable ways. When a particular signature generates alerts for what’s actually regular business traffic, the same alert will be triggered every time that business process takes place. It thus produces a very obvious cluster that you can exclude when looking for malicious activity.
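Here’s a deliberately simple Python sketch of that idea. Instead of a real clustering algorithm, it just groups alerts that share the same signature, source, and destination, which is enough to show how a recurring business process surfaces as one large, obvious group. The field names, signature IDs, and size cutoff are assumptions for illustration.

```python
from collections import Counter

def recurring_clusters(alerts, min_size):
    """Group alerts by (signature, source, destination); keep groups of at least min_size."""
    counts = Counter((a["sig"], a["src"], a["dst"]) for a in alerts)
    return {key: n for key, n in counts.items() if n >= min_size}

def remaining_alerts(alerts, clusters):
    """Alerts left once the recurring groups are set aside."""
    return [a for a in alerts if (a["sig"], a["src"], a["dst"]) not in clusters]

# Hypothetical alerts: a nightly backup job trips the same signature repeatedly.
alerts = (
    [{"sig": "SIG-1001", "src": "10.0.0.20", "dst": "10.0.0.30"}] * 50
    + [{"sig": "SIG-2002", "src": "10.0.0.7", "dst": "198.51.100.4"}]
)
clusters = recurring_clusters(alerts, min_size=10)
leftover = remaining_alerts(alerts, clusters)
```

A large group is only a candidate for exclusion. Someone who knows the business process still has to confirm it’s benign before it gets filtered out.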

The more information known to be unimportant that we can remove, the more clearly we can see what else is going on. This is where analytical detection comes into its own. Very often, we run algorithms on security data simply to see if a subject matter expert can interpret the outcome. Possessing both domain expertise and knowledge of data science is critical if you want to understand what advanced algorithms are telling you.

Visualization and hunting are nth order processing tasks. Using tools that allow you to pivot and display related datasets is the ultimate form of security threat hunting, and it’s also the most fun. You can derive some detection value from considering any layer of detectors through the lens of a visual tool.

Do you think I’m about to tell you there’s another layer called “artificial intelligence”? If so, you’re wrong.

The next layer is simply making a decision: has something malicious occurred? The more information we have to feed into the decision-making process, the more effective and deeper the decision will be. All of the information sources listed above have something of value to contribute.

But you have to ask yourself: how many of these factors can analysts consider in real time as they watch events streaming across a console?

If you’d like to make it possible for your security operations team to incorporate input from a greater variety of detectors and information sources into their decision-making processes and workflows, consider adding the Respond Analyst to your team. Built to integrate with a broad array of today’s most popular sensors, platforms and solutions, the Respond Analyst brings specialized threat intelligence and detailed local contextual information to bear on every decision it makes about which events to escalate. Quite simply, it’ll give your ability to interpret and analyze detection data a boost—and allow your analysts to consider a far wider variety of sources.

To learn more about how the Respond Analyst can help your business become more thorough and derive greater insight from the detectors in your environment, contact us to schedule a demo today.

How Automating Long Tail Analysis Helps Security Incident Response

Modern cybersecurity solutions must scale to unparalleled levels because constantly expanding attack surfaces produce enormous volumes of diverse data to process. Scale issues have shifted from the sheer volume of traffic, such as IoT-led DDoS attacks and the traffic from multiple devices, to the need for absolute speed in identifying and catching the bad guys.

Long tail analysis comes down to looking for very weak signals from attackers who are technologically savvy enough to stay under your radar and remain undetected.

But what’s the most efficient way to accomplish what can be a time-consuming and highly repetitive task?

What is Long Tail Analysis?

You might be wondering what the theory behind long tail analysis is, even if you’re familiar with the term and may already be performing it frequently in your security environment. The term Long Tail first emerged in 2004, coined by Wired editor-in-chief Chris Anderson to describe “the new marketplace.” His theory is that our culture and economy are increasingly shifting away from a focus on a relatively small number of “hits” (mainstream products and markets) at the head of the demand curve and toward a huge number of niches in the tail.

In a nutshell, this is how we explain long tail analysis in cybersecurity: you’re threat hunting for the least common events, which will be the most useful in understanding anomalous behaviour in your environment.

Finding Needles in Stacks of Needles

Consider the mountains of data generated from all your security sources. It’s extremely challenging to extract weak signals while avoiding all the false positives. The usual attempt to resolve this challenge is to give analysts banks of monitors displaying different dashboards, which they need to be familiar with in order to detect malicious patterns. As you know, this doesn’t scale. We cannot expect a person to react to these dashboards consistently, nor can we expect them to “do all the things.”

Instead, experienced analysts enjoy digging into the data. They’ll pivot into one of the many security solutions used to combat cybersecurity threats, such as log management solutions, packet analysis platforms, and even some endpoint agents, all designed to record and play back a historical record. We break down common behaviours looking for those outliers. We zero in on these ‘niche’ activities and understand them one at a time. Unfortunately, we can’t always get to every permutation, and some are left unresolved.

Four Long Steps of Long Tail Analysis in the SOC

If you are unfamiliar with long tail analysis, here are the four steps a typical analyst will work through:

Step 1: First, you identify events of interest, such as user authentications or website connections. Then, you determine how to aggregate the events in a way that provides enough meaning for analysis. Example: graph user accounts by the number of authentication events, or web domains by the number of connections.

Step 2: Once the aggregated data is grouped together, the distribution might be skewed in a particular direction with a long tail either to the left or right.  You might be particularly interested in the objects that fall within that long tail.  These are the objects that are extracted, in table format, for further analysis.

Step 3: For each object, you investigate as required. For authentications, you would look at the account owner, the number of authentication events, and the purpose of the account, all with the goal of understanding why that specific behaviour is occurring.

Step 4: You then decide what actions to take and move on to the next object.  Typically, the next steps include working with incident responders or your IT team.  Alternatively, you might decide to simply ignore the event and repeat Step 3 with the next object.
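Steps 1 and 2 above can be sketched in a few lines of Python. This is a minimal illustration, assuming web connection events with a `domain` field. The domains, the event counts, and the cutoff of five connections are all hypothetical, and in practice you’d pick the cutoff by looking at the distribution.

```python
from collections import Counter

def long_tail(events, key, max_count):
    """Aggregate events by key, then keep the objects seen at most max_count times."""
    counts = Counter(e[key] for e in events)
    tail = [(obj, n) for obj, n in counts.items() if n <= max_count]
    return sorted(tail, key=lambda item: (item[1], item[0]))  # rarest first

# Hypothetical web connection events.
events = (
    [{"domain": "intranet.example.com"}] * 500
    + [{"domain": "mail.example.com"}] * 300
    + [{"domain": "rare-one.example.net"}] * 2
    + [{"domain": "rare-two.example.org"}]
)
tail = long_tail(events, key="domain", max_count=5)
for domain, n in tail:
    print(f"{domain}\t{n}")  # each row is a candidate for Step 3 investigation
```

The table that comes out is where the manual work of Steps 3 and 4 begins: each row still needs a person (or something acting on their behalf) to investigate and decide.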

Is There a Better Solution?

At Respond Software, we’re confident that long tail analysis can be automated to make your team more efficient at threat hunting. As we continue to build Respond Analyst modules, we move closer to delivering on that promise — and dramatically improve your ability to defend your business.

As Security Analysts, Instead of Threat Hunting We’ve Become Ticket Monkeys

We’ve heard repeatedly from security analysts (like those interviewed in Cyentia’s Voice of the Analyst Survey) that event monitoring is time-consuming, boring, and repetitive, that they feel like ticket monkeys interfacing with IT, and that they only occasionally get to do the fun work of threat hunting.

Did you know that EPPs (Endpoint Protection Platforms, commonly called Next-Gen Antivirus, NGAV or AV) are a foundational data source in security operations, but that evaluating and acting on their alerts can also be a time sink for security analysts?

Generally, EPPs generate high-fidelity alerts: when one fires, the system is likely infected with malware. Given such an alert, a security analyst must decide whether:

1. the infected system presents a serious threat to the organization and an incident response procedure is required

2. the system is in fact infected but the threat is not that serious and can be safely mitigated by creating a ticket for IT or simply reimaging the machine

3. the alert can be dismissed because it is not a threat and no action is required at this time

And how does a skilled security analyst come to an accurate and appropriate decision?

Context. Context. Context.

A security analyst must understand the importance of the involved systems and accounts. Is this a server or a workstation? Is this the CEO’s laptop? Do the systems have any vulnerabilities?

Security Expertise.

Not all malware is created equal. A security analyst must understand the type of malware, its function, potential harm, and ability to spread. Analysts gain expertise on the job, through research, or through arduous certifications (which they then need to keep current).

Experience.

Good security analysts won’t assume that the action taken by the endpoint agent (aka EPP) will fully remediate the issue. They will look for other indicators and evidence, such as relevant, corroborating network IPS alerts. Experienced analysts know that when one piece of malware is observed, more are likely lurking.

Awareness.

Of course, the security analyst must determine whether this threat is even relevant to their environment. Alternatively, the threat could be part of something ongoing within their organization or part of an external campaign.

Analyzing the situation thoroughly and making the appropriate decision takes time.

On top of that, interfacing with IT and generating tickets to remove commodity malware from a workstation may not meet the expectations of hungry analysts eager to be hunting for bad guys.

It’s no surprise that SOC teams are falling behind their unrelenting event loads and that 1 in 4 security analysts expresses dissatisfaction with their current job.

But wait…

There is a solution besides wringing hands or hiring more analysts. Turns out, we created a Virtual Security Analyst to expertly analyze malware events and recommend a course of action. And get this, our virtual security analyst is fast, scalable, and 100% (yes, that’s right) 100% consistent in performing dozens of checks while evaluating every event. On top of that, Respond Analyst integrates with most ticketing and case management solutions, freeing your analysts from time-consuming ticket creation processes.

Don’t you just want to learn more about why we were named one of Gartner’s Cool Vendors?

Please reach out to learn how to augment your team with the Respond Analyst today.
