An Introduction to Integrated Reasoning
In part one of this two-part series, we introduce the concept of reasoning and the role of influences, or data sources, in how integrated reasoning is applied to cybersecurity.
Reasoning is the process by which we rationalize information, reduce uncertainty, and make decisions.
As humans, we make thousands of decisions every day – ranging from fast, instinctual decisions where the influences on our decision-making process are simple and transparent, to more complex decisions where the influences are opaque and perhaps perceived as irrational by others.
Regardless, the influences are there – they just need to be unpacked.
Machines can reason and make decisions too. Probabilities are simply the quantification of uncertainty. Therefore, mathematical models leveraging probabilities can emulate the decision-making process of humans, reducing uncertainty towards an outcome.
In applying this to cybersecurity, we first must identify the decision we wish to model.
The objective of the Respond Analyst is to outperform the monitoring and triage tasks of a human security analyst. The decision facing a human security analyst (and the Respond Analyst, for that matter) is whether the observed activity responsible for the security alert is malicious and actionable, thus requiring an incident response to remediate the problem.
Now, to unpack that decision: upon receiving an alert, the security analyst begins their investigation with a high degree of uncertainty that the alert is malicious and actionable, a product of the analyst’s incomplete and imperfect information about the situation (and of the fact that most alerts are false positives). In the security domain, this uncertainty pertains to the attacker and their motive, the target, and the difficulty of extracting malicious activity from the noise of normal user and administrative traffic.
To address this uncertainty, analysts collect relevant contextual information and apply their prior knowledge through a series of triage steps, drawing on their experience, their familiarity with their environment, and their security expertise in the relevant network or endpoint telemetry and attack patterns. Each step reduces their uncertainty towards either a malicious and actionable or a benign outcome.
Traditional approaches to cyber-defense have been based on deterministic mathematics, or rules, which produce a true or false outcome. But as described above, a deterministic result often does not appropriately characterize the uncertainty inherent in imperfect and incomplete information. The results that probabilistic mathematics generates cannot simply be categorized as yes or no; instead, they express degrees of confidence and capture the shades of gray that exist within our data and interactions. At Respond, we call this process ‘Integrated Reasoning’ because of the number of influences upon the decision that need to be integrated from other alerting telemetries, contextual sources, and threat intelligence solutions to answer our triage questions.
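As a rough illustration (not Respond's actual model), integrating several influences can be sketched as a naive-Bayes update: start from a low prior that an alert is malicious, and let each piece of evidence multiply the odds by a likelihood ratio. All of the numbers below are invented for the sketch:

```python
import math

# Prior probability that any given alert is malicious and actionable.
# Most alerts are false positives, so the prior is low.
PRIOR = 0.01

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def update(prior, likelihood_ratios):
    """Naive-Bayes update: each influence contributes a likelihood ratio
    P(evidence | malicious) / P(evidence | benign). Ratios above 1 push
    the score up; ratios below 1 push it down."""
    log_odds = logit(prior)
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return sigmoid(log_odds)

# Hypothetical influences on a single alert:
evidence = [
    8.0,   # destination domain was registered very recently
    5.0,   # targeted asset holds sensitive data
    0.5,   # source IP belongs to a known business partner
]
posterior = update(PRIOR, evidence)
print(f"P(malicious | evidence) = {posterior:.3f}")
```

Note how the result is a graded score rather than a true/false verdict: strong evidence raises it, exculpatory context lowers it, and the low prior keeps a single weak signal from triggering a response on its own.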
First, is the data accurate?
In many organizations, a security analyst cannot trust that the IP address or hostname in a security alert represents the true system that generated it. As organizations scale out, add branches, move to the cloud, and acquire companies, the challenges of managing data and securing those networks only increase. In a company that uses Dynamic Host Configuration Protocol (DHCP) to dynamically assign IP addresses for wireless networks, a system may lease an IP address for a matter of minutes before that address is reassigned to another system. Therefore, it is highly likely that the IP address of a potentially infected system has changed between the time the alert was generated and the time a human analyst evaluates the data. What a headache!
To address this problem, we have developed a proprietary system service that constantly maintains the pairing between IP addresses and hostnames, updated by the most recent and accurate data received – be it a new DHCP lease, an endpoint protection event, or an Endpoint Detection and Response (EDR) alert. As a result, the Respond Analyst can be sure that the subsequent triage questions are attributed to the system in question. Accurate attribution is particularly important when evaluating behavioral patterns and the scope of an incident. As an example, a single system assigned multiple IP addresses over time that is infected with command-and-control malware, constantly beaconing to malicious domains, could appear to be a widespread malware outbreak if attribution to a single system is not done properly.
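The service itself is proprietary, but the core idea – keep only the freshest IP-to-hostname observation, whichever telemetry reported it – can be sketched in a few lines. The class and field names here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Binding:
    hostname: str
    timestamp: float  # epoch seconds of the observation

class AttributionTable:
    """Keeps the most recently observed IP -> hostname pairing,
    regardless of which source (DHCP lease, endpoint protection
    event, EDR alert) reported it."""

    def __init__(self):
        self._by_ip = {}

    def observe(self, ip, hostname, timestamp):
        current = self._by_ip.get(ip)
        # Accept the observation only if it is newer than what we hold,
        # so late-arriving stale records cannot overwrite fresh ones.
        if current is None or timestamp > current.timestamp:
            self._by_ip[ip] = Binding(hostname, timestamp)

    def resolve(self, ip):
        """Return the hostname currently attributed to this IP, if any."""
        binding = self._by_ip.get(ip)
        return binding.hostname if binding else None

table = AttributionTable()
table.observe("10.0.0.5", "laptop-alice", timestamp=1000)  # DHCP lease
table.observe("10.0.0.5", "laptop-bob", timestamp=2000)    # newer lease
print(table.resolve("10.0.0.5"))  # laptop-bob
```

With a table like this, alerts carrying only an IP address can be pinned to a stable hostname before any behavioral or scoping analysis, which is what prevents one beaconing laptop from masquerading as an outbreak.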
Next, who is involved?
Of the internal assets, what type of asset is being targeted? What are the asset's business function and criticality? Does it deal with sensitive information? Who is the user and what do they do? Does the asset have vulnerabilities?
Security alerts generally contain only an IP address or a hostname as the identifiable system information, so analysts must seek answers to the above questions by consulting other information repositories. Unfortunately, many organizations struggle to keep an up-to-date inventory of their assets – shockingly, even the critical ones. Understanding the type, function, and criticality of an asset not only helps with all the subsequent triage questions, but also dictates the severity assignment of incident response procedures. Are we waking the CISO up for a business-impacting intrusion, or are we going to submit a ticket to reimage an infected workstation?
We understand that data can go stale, so rather than relying solely on users to manually configure static context lists, the Respond Analyst infers an entity’s classification and importance. For example, the Respond Analyst infers an asset’s classification from vulnerability scan data and assigns an asset criticality (Critical = Domain Controller, DNS Server; High = Web Server, Database Server, File Server; Medium = Server; Low = Workstation). The Respond Analyst performs a similar exercise for user accounts.
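A sketch of that kind of inference follows, using the criticality tiers above. The service-to-type heuristics are invented for illustration and are not Respond's actual rules; in practice the signal comes from vulnerability scan data:

```python
# Criticality tiers taken from the examples in the text.
CRITICALITY = {
    "domain controller": "Critical",
    "dns server": "Critical",
    "web server": "High",
    "database server": "High",
    "file server": "High",
    "server": "Medium",
    "workstation": "Low",
}

def infer_criticality(open_services):
    """Guess an asset's type from services a vulnerability scan reports,
    then map that type to a criticality tier. The heuristics below are
    hypothetical stand-ins for real scan-based classification."""
    services = {s.lower() for s in open_services}
    if "kerberos" in services and "ldap" in services:
        asset_type = "domain controller"
    elif "dns" in services:
        asset_type = "dns server"
    elif "http" in services or "https" in services:
        asset_type = "web server"
    elif "mysql" in services or "mssql" in services:
        asset_type = "database server"
    elif "smb" in services:
        asset_type = "file server"
    elif "ssh" in services or "rdp" in services:
        asset_type = "server"
    else:
        asset_type = "workstation"
    return asset_type, CRITICALITY[asset_type]

print(infer_criticality(["HTTPS", "SSH"]))  # ('web server', 'High')
```

The value of inferring rather than hand-maintaining this mapping is that it degrades gracefully: when the asset inventory is stale, the scan data still yields a reasonable default criticality for severity assignment.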
Not all alerts are communication-based or involve external systems, but when they do, an analyst must understand who that external system is. Where are they coming from? Are they a partner? Are they a customer? Are they a known bad actor? Are they trying to anonymize their identity? How recently was the domain registered?
Answering these questions alone requires several separate integrations: IP geolocation data, organizational context, threat intelligence providers, and WHOIS domain information, respectively. The integrations themselves are simple enough, but the volume of data and the number of external connections make manual analysis intractable.
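A minimal sketch of what those four integrations supply is below, with stand-in dictionaries in place of real geolocation, partner-context, threat-intelligence, and WHOIS services; every name and data point here is hypothetical:

```python
# Stand-ins for four separate integrations. In a real deployment each
# of these would be an external service call, not an in-memory dict.
GEO = {"203.0.113.7": "DE"}                                  # IP geolocation
PARTNERS = {"198.51.100.12"}                                 # organizational context
THREAT_INTEL = {"203.0.113.7": "known C2 infrastructure"}    # threat intel feed
DOMAIN_AGE_DAYS = {"fresh-payload.example": 3}               # WHOIS registration

def enrich(ip, domain):
    """Gather the context an analyst would want about an external system."""
    return {
        "country": GEO.get(ip, "unknown"),
        "is_partner": ip in PARTNERS,
        "threat_intel": THREAT_INTEL.get(ip),
        "domain_age_days": DOMAIN_AGE_DAYS.get(domain),
    }

context = enrich("203.0.113.7", "fresh-payload.example")
print(context)
```

The lookup is trivial per event; the hard part, as the next paragraph shows, is doing it hundreds of times per browsing session across an entire enterprise.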
As an example of data volumes, I evaluated the logs generated by visiting amazon.com and searching for a product. In that two-click session, I generated over 900 web filtering events and visited over 200 unique domains – many of them ad trackers and hosting sites that remotely deliver the content on the page. Multiply these figures across a large enterprise with longer user browsing sessions, and the data is enormous!
The Respond Analyst comes prepackaged with integrations into the relevant contextual repositories. We have engineered the Respond Analyst to efficiently process high-volume data sources like web filtering and network IDS/IPS, and to scale to meet the load of the largest financial, e-commerce, and industrial institutions. In the example above, it only takes one site delivering a malicious payload to compromise an internal asset!
In contrast, SIEM applications and playbooks are not designed to evaluate high-volume data sources. In surveying our customers, we found that most SOAR deployments have stagnated, with playbooks taking months or years to implement. Moreover, the most commonly implemented playbooks are just for alert enrichment, which attaches contextual information to a select few events and still requires a human analyst’s evaluation and interpretation! The promise of automation ends with humans still being the bottleneck.