The Eight Fragments of SIEM
Security Information and Event Management (SIEM) was declared dead more than a decade ago, and yet it is still widely deployed in many security programs. We are used to thinking about SIEM as a monolithic platform capability, but it isn’t anymore. It can be broken down into eight discrete capabilities, each of which is evolving and innovating at a different rate. This will drive fragmentation of traditional SIEM, but that is probably a good thing for the effectiveness of our security detection and response programs.
To simplify their vendor management and reduce technical complexity, many security teams will only buy a new security tool if it eliminates other existing tools from their portfolio. This is the primary counter-argument against fragmentation of SIEM capabilities. Here’s the question we have to answer: “Is it more important for information security to increase our capability to detect and respond, or to simplify our tool portfolio?”
Traditional SIEM vendors will always struggle to balance innovation and investment across so many different capabilities. These vendors attempt to differentiate by bringing a greater capability in a few areas, or by using a peanut butter approach, spreading investment thinly to achieve a lower but uniform capability across the board.
This presents a difficult set of business decisions, typically driven by the available budget to invest. Large companies are in the business of earning profitable revenue from the sale of these products, while small startups are in the business of innovating and delivering new capabilities leveraging focused growth investment.
Here are the eight fragments of SIEM:
1. Data Collection and Normalization
The first thing that a SIEM does is collect data from originating data sources and parse it into a common format. Structured data is far easier for automated logic to work with than unstructured data. Data volume, velocity, and variety are bread and butter for big data (plumbing) providers, like Hadoop or Elasticsearch, and yet parsing security data into a workable format is still the province of SIEM technologies. New concepts emerge every day in big data management and dedicated big data platforms. Open source projects keep up with these improvements more effectively than SIEMs, so it feels as if SIEMs will eventually get out of the plumbing business.
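To make "parse it into a common format" concrete, here is a minimal sketch in Python. The raw log layout, the regular expression, and the field names of the common schema are all assumptions chosen for illustration; real parsers handle many more formats and edge cases.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical raw log layout (ISO timestamp, host, sshd message); assumed for illustration.
SSH_FAIL = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) sshd\[\d+\]: "
    r"Failed password for (?P<user>\S+) from (?P<src_ip>\S+) port (?P<port>\d+)"
)

def normalize(line: str) -> Optional[dict]:
    """Parse one raw line into a common event schema, or return None if unrecognized."""
    m = SSH_FAIL.match(line)
    if m is None:
        return None
    return {
        # Convert the timestamp to UTC so events from different sources line up.
        "timestamp": datetime.fromisoformat(m["ts"]).astimezone(timezone.utc).isoformat(),
        "host": m["host"],
        "user": m["user"],
        "src_ip": m["src_ip"],
        "src_port": int(m["port"]),
        "action": "login",
        "outcome": "failure",
    }

raw = "2024-05-01T12:00:00+00:00 web01 sshd[4242]: Failed password for root from 203.0.113.7 port 51234"
event = normalize(raw)
```

Once every source is mapped onto the same field names, downstream logic no longer needs to know which product produced the event.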
2. Context
“Context is King.” Once data has been collected into a SIEM platform, understanding that data in its full context is critical to effective detection and response. This includes understanding internal assets, external threat intelligence, internal IT operations, and event patterns. An IP address or hostname is only minimally informative, but if it can be described as a critical asset running a point-of-sale application, then we understand how important it is to our business. There is fragmentation even within this category, along the user, network and asset divides. UEBA, NAC, IAM anyone? Each is a challenging problem in itself.
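As a rough illustration of adding context, the sketch below joins an event against an asset inventory and a threat-intelligence list. Both lookups, the hostname, and all field names are hypothetical; in practice this data would come from a CMDB, an intelligence feed, or similar systems.

```python
# Hypothetical asset inventory keyed by IP address (assumed for illustration).
ASSETS = {
    "10.0.5.20": {"hostname": "pos-store-114", "role": "point-of-sale", "criticality": "critical"},
}
# Hypothetical set of known-bad source addresses from threat intelligence.
THREAT_INTEL = {"203.0.113.7"}

def enrich(event: dict) -> dict:
    """Return a copy of the event with asset and threat-intelligence context attached."""
    enriched = dict(event)
    # Unknown destinations get an explicit "unknown" criticality rather than no field at all.
    enriched["dest_asset"] = ASSETS.get(event.get("dest_ip"), {"criticality": "unknown"})
    enriched["src_known_bad"] = event.get("src_ip") in THREAT_INTEL
    return enriched

event = {"src_ip": "203.0.113.7", "dest_ip": "10.0.5.20", "action": "login", "outcome": "failure"}
enriched = enrich(event)
```

The bare IP address 10.0.5.20 now reads as a critical point-of-sale asset being contacted by a known-bad source, which is a very different event from the raw record.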
3. Detection Logic
Once data has been collected and context has been added to it, we can apply analytical or detection science to identify those events that require further response. This area of the market is innovating and evolving at an accelerating rate. For many years Boolean logic was the limit of what was available to identify events of interest. It was used simply to “funnel” the event volume down to a manageable amount by alarming on specifically described security scenarios. These scenarios were sometimes called correlated events, but more often they were narrowly defined common security situations, like multiple failed logins or only high severity IDS alerts.
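The "multiple failed logins" scenario can be sketched as a simple threshold rule. This is an illustrative toy, not any vendor's correlation engine; the threshold, the window, and the event field names are assumptions, and events are assumed to arrive sorted by time.

```python
from collections import defaultdict, deque

def failed_login_alerts(events, threshold=5, window_seconds=60):
    """Yield (user, time) whenever a user reaches `threshold` failures within the window."""
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    for e in events:
        if e["action"] != "login" or e["outcome"] != "failure":
            continue
        q = recent[e["user"]]
        q.append(e["time"])
        # Drop failures that have fallen out of the sliding window.
        while q and e["time"] - q[0] > window_seconds:
            q.popleft()
        if len(q) >= threshold:
            yield e["user"], e["time"]
            q.clear()  # reset so one burst produces one alert

events = [{"user": "alice", "action": "login", "outcome": "failure", "time": t} for t in (0, 10, 20, 30, 40)]
events.append({"user": "bob", "action": "login", "outcome": "failure", "time": 45})
alerts = list(failed_login_alerts(events))  # one alert: ("alice", 40)
```

A rule like this funnels volume effectively, but it only ever finds the scenario someone thought to describe in advance.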
With the advent of advanced analytical techniques, machine learning and artificial intelligence, the quality of logic that can be applied to every security event has improved dramatically. Given that time to detection is often measured in months or years, anything that can reduce it is of critical importance to the success of our security programs.
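To contrast with fixed Boolean rules, here is a deliberately simple statistical baseline. It is a stand-in for the idea of learning "normal" from data, not an example of machine learning as deployed in any product; the z-score threshold and the sample counts are assumptions.

```python
from statistics import mean, stdev

def is_anomalous(history, today, z_threshold=3.0):
    """Flag today's count if it sits more than z_threshold standard deviations above the mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return today != mu
    return (today - mu) / sigma > z_threshold

# Hypothetical daily counts of outbound connections for one host.
history = [100, 110, 95, 105, 90, 102, 98]
spike = is_anomalous(history, 400)   # far above the baseline
normal = is_anomalous(history, 108)  # within ordinary variation
```

Unlike the threshold rule, nobody had to decide in advance what number of connections counts as suspicious for this host.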
4. Console
Once that logic was applied, an event or correlated event was displayed on a console for an analyst to evaluate. This key bottleneck reduced what an analyst could actually look at to 0.0001% or less of the total security events generated. In addition, human factors around monitoring alarms at scale resulted in many missed detections. The idea that one more alert is going to help an analyst make a decision is a false one, and so the console truly is dead. One recent attempt to replace the console centered on the use of dashboards to summarize security information, but this was even less effective: instead of being buried in correlated events, you were buried in summary dashboards.
5. Workflow
Once some form of logic has enabled an analyst to determine that an ongoing incident might require action, the system provides basic workflow management so that the events of interest can be moved through an analytical process and assembled into a case for an incident responder. Many incident response teams then transition to a dedicated IR case management solution and away from the SIEM’s workflow. Measuring these steps and understanding the long poles in time-to-resolution is very important for our ability to improve our operations. This is an area that needs more innovation, and more glue to connect all the actions taken on the way to resolution.
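Measuring the steps on the way to resolution can be as simple as timestamping each stage and computing the gaps. The stage names and timestamps below are hypothetical, and this is only a sketch of the measurement idea.

```python
from datetime import datetime

# Hypothetical stage names and timestamps for a single incident.
timeline = [
    ("detected", "2024-05-01T12:00:00"),
    ("triaged", "2024-05-01T12:45:00"),
    ("escalated", "2024-05-01T13:00:00"),
    ("contained", "2024-05-01T18:00:00"),
    ("resolved", "2024-05-02T09:00:00"),
]

def stage_durations(timeline):
    """Return the hours elapsed between each consecutive pair of stages."""
    parsed = [(name, datetime.fromisoformat(ts)) for name, ts in timeline]
    return {
        f"{a}->{b}": (t_b - t_a).total_seconds() / 3600
        for (a, t_a), (b, t_b) in zip(parsed, parsed[1:])
    }

durations = stage_durations(timeline)
# The "long pole" is the transition that consumed the most time.
long_pole = max(durations, key=durations.get)
```

Aggregated over many incidents, the same calculation shows which hand-off most deserves attention.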
6. Case Management
A case is simply a container for all events, context and analyst descriptions of a single incident. A case allows an incident responder to rapidly decide whether they should continue their investigation and what the priority of that investigation should be. It also provides a formal record of the investigation as it is conducted and supports a forensically sound process for incident handling. While many SIEM solutions contain a small case management function, dedicated products such as ServiceNow and IBM Resilient have innovated well beyond what standard SIEM offers.
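The "container" idea maps naturally onto a small data structure. The sketch below is an assumed shape, not the data model of ServiceNow, IBM Resilient, or any SIEM; the field names are illustrative, and a timestamped note list alone does not make a process forensically sound.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Case:
    """Container for the events, context, and analyst notes of one incident."""
    title: str
    priority: str = "medium"
    events: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    notes: list = field(default_factory=list)

    def add_note(self, analyst: str, text: str) -> None:
        # Each note is timestamped in UTC to build a record of the investigation.
        self.notes.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "analyst": analyst,
            "text": text,
        })

case = Case(title="Repeated failed logins on a point-of-sale host", priority="high")
case.events.append({"src_ip": "203.0.113.7", "action": "login", "outcome": "failure"})
case.context["asset_role"] = "point-of-sale"
case.add_note("analyst1", "Source address appears on a threat-intelligence list.")
```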
7. Automated Response
All basic SIEM tools provide external integrations (think right-click tools). These can gather additional investigative information, sometimes called decorating the alert, or take action to halt attackers by imposing firewall or IPS blocks in near real-time. Many companies have conducted experiments with rapid blocking directly from within the SIEM, inevitably resulting in “self-denial of service.” There is now a new category of vendors, called security orchestration, automation, and response (SOAR), who are innovating in this space. They position themselves as downstream of the SIEM for response automation, but at the moment they are mostly upstream, in alert decoration.
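One way to reduce the risk of "self-denial of service" is to put guardrails in front of any automated block. The sketch below only decides what to do; it does not call a firewall. The protected ranges, the rate limit, and the alert fields are assumptions for illustration.

```python
import ipaddress

# Hypothetical ranges that automation must never block (own networks, partners).
PROTECTED = [ipaddress.ip_network("10.0.0.0/8"), ipaddress.ip_network("192.0.2.0/24")]

def decide_response(alert: dict, max_blocks: int, blocks_so_far: int) -> str:
    """Return 'block', 'decorate', or 'escalate' for an alert."""
    ip = ipaddress.ip_address(alert["src_ip"])
    if any(ip in net for net in PROTECTED):
        return "escalate"  # never auto-block protected addresses; a human decides
    if blocks_so_far >= max_blocks:
        return "escalate"  # rate limit reached; stop blocking automatically
    if alert.get("src_known_bad"):
        return "block"
    return "decorate"      # gather more information instead of acting

action = decide_response({"src_ip": "203.0.113.7", "src_known_bad": True}, max_blocks=10, blocks_so_far=0)
```

Note that most paths end in decoration or escalation rather than blocking, which mirrors where the automation vendors currently spend most of their effort.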
8. Forensic Data Lake and Search
Another critical function of most SIEMs is the maintenance, preservation, and availability of security logs for forensic analysis (meaning post-incident detection) and for hunting novel incidents. This capability relies heavily on the speed of data retrieval, and is generally dominated by columnar or parallel data platforms (think Apache Spark or Vertica). Since the mean time to detection is about three-quarters of a year, we need faster access to much more data than ever before if we are going to hunt where the attackers are located in time. No SIEM can provide this without extraordinary cost, and this is another avenue for the big data solutions to outperform traditional SIEM.
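A back-of-the-envelope calculation shows why retention drives cost. Only the three-quarters-of-a-year figure comes from the text above; the daily log volume and compression ratio are hypothetical inputs to be replaced with your own numbers.

```python
def retention_storage_tb(daily_gb: float, retention_days: int, compression_ratio: float) -> float:
    """Estimate searchable storage (in TB) needed to retain logs for the given period."""
    return daily_gb * retention_days / compression_ratio / 1000

# Three-quarters of a year, rounded to whole days.
mean_days_to_detect = round(0.75 * 365)

# Hypothetical inputs: 500 GB of logs per day, stored at 5:1 compression.
storage_tb = retention_storage_tb(daily_gb=500, retention_days=mean_days_to_detect, compression_ratio=5)
```

With these assumed inputs the estimate comes to roughly 27 TB that must stay quickly searchable, and retaining less than the detection window means hunting in data that no longer covers the attacker's early activity.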
There is so much innovation and speed in the security market in each of these categories that it is difficult to ignore this fragmentation and blindly continue with a single monolithic platform. Ultimately, we are paid to protect our companies and customers, to defend them on an increasingly hostile Internet where the consequences of failure continue to grow exponentially. This means we cannot afford to ignore the innovation around the eight fragments of security information and event management.