SIEM vendors claim to provide machine learning functionalities in their solutions. Gartner recently covered the growing arena of Machine Learning Log Analysis, and how it is being positioned as a complement to SIEM. What do CISOs and security directors need to look for to effectively navigate ML in their security platform?
Machine Learning (ML) is
a branch of Artificial Intelligence (AI) which uses algorithms and statistical models to perform
tasks without using explicit instructions, relying on patterns and inference instead. In
the security arena, ML’s purpose is to take raw log and event data and turn it
into actionable intelligence about security events. Like other forms of AI, ML
is instrumental in bringing more intelligent automation to the cumbersome and
complex challenge of managing security systems.
Traditionally, security products used (or claimed to use) ML to detect behavioral anomalies that might
lead to a security incident – typically calling it “detecting unknown attacks.”
In the past few years, more and more cybersecurity operations initiatives have been using ML to help automate the management of security tools, deriving much more value, much faster, from the tools and from the data they generate. It’s all about time to value from your security tools, or in other words, shortening the time to respond to real attacks.
Gartner Defines the ML Log Analysis Arena
Gartner’s Security Research Director Eric Ahlm published a report titled “Emerging Technology Analysis: Machine Learning Log Analysis
Disrupts Traditional SIEM Buying Models” in October 2019. The report highlights some ways in which this
new arena of ML Log Analysis is challenging the SIEM arena, and leading
organizations to redirect some of their SIEM budget to ML Log Analysis.
While the report includes recommendations for ML Log Analysis
vendors on how to position their offerings, the question remains: how should
buyers view this new arena, and how should they judge whether their SIEM requires a “wingman”
in the form of ML Log Analysis?
The Gartner report positions ML Log Analysis everywhere on
the spectrum from reinforcement for SIEM (with added budgets) to full SIEM
replacement. “An ML-based log solution can augment functionality, help scale
data or operations, or in some specific cases out right replace an existing
SIEM,” states the report.
Example providers listed in the report include empow, Uplevel Security and others.
Drivers for ML Implementation
What drives some organizations to add ML log analysis to SIEM
in their platforms?
One driver is a common SIEM disease: license cost creep. Most SIEM pricing models are based on
data volume: the less data the SIEM needs to digest, the lower the cost.
Some organizations therefore choose to add an ML log analysis intelligence layer in
front of their SIEM, which can condense some of the data and generate
fewer alerts for the SIEM to digest. This can be a good
approach and will fit some end users’ needs.
However, for some users this approach will not work, because regulations
require them to keep every piece of raw log data.
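The cost driver can be put into rough numbers. The sketch below assumes a simple per-gigabyte pricing model; the function names, the price and the reduction figure are illustrative assumptions, not taken from any vendor's price list or from the Gartner report.

```python
def daily_license_cost(gb_per_day: float, price_per_gb: float) -> float:
    """Cost under an assumed flat, volume-based pricing model."""
    if gb_per_day < 0 or price_per_gb < 0:
        raise ValueError("volume and price must be non-negative")
    return gb_per_day * price_per_gb


def cost_after_reduction(
    gb_per_day: float, price_per_gb: float, reduction_fraction: float
) -> float:
    """Cost once an upstream ML layer forwards only part of the data.

    reduction_fraction is the share of data the layer keeps away from the
    SIEM (0.0 = nothing removed, 1.0 = everything removed).
    """
    if not 0.0 <= reduction_fraction <= 1.0:
        raise ValueError("reduction_fraction must be between 0 and 1")
    forwarded_gb = gb_per_day * (1.0 - reduction_fraction)
    return daily_license_cost(forwarded_gb, price_per_gb)


if __name__ == "__main__":
    # Hypothetical figures: 500 GB/day, 2.0 currency units per GB, 50% reduction.
    before = daily_license_cost(500, 2.0)
    after = cost_after_reduction(500, 2.0, 0.5)
    print(f"before: {before:.2f}, after: {after:.2f}")
```

Real SIEM pricing is often tiered or based on events per second rather than a flat rate, so a calculation like this is only a starting point for a conversation with the vendor. It also says nothing about the storage cost of raw logs that must be retained for compliance.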
A second driver for adding ML log analysis to an existing
SIEM is identified in the Gartner report under the headline “Scale”: Scaling
investigators through alert reduction and accuracy, scaling knowledge through
predictive analysis and scaling response time normally required for lookup,
manual data linkage or search tasks.
To achieve improved scaling, an ML intelligence layer is
integrated into the SIEM’s data repository or sits on top of it. Its promise is
to analyze the collected data and remove false positives and noise through
automatic investigation actions, prioritizing the most relevant data that the
user should focus on.
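As a minimal sketch of what “removing noise and prioritizing” can mean in practice, the example below assumes that an ML layer has already attached a score between 0 and 1 to each alert. The `Alert` class, the `score` field and the threshold value are hypothetical and stand in for whatever a given product actually produces.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    source: str
    description: str
    # Hypothetical model output: estimated likelihood (0..1) that the
    # alert reflects a real attack rather than noise.
    score: float


def prioritize(alerts: List[Alert], noise_threshold: float = 0.3) -> List[Alert]:
    """Drop alerts scored below the threshold and rank the rest, highest first."""
    kept = [a for a in alerts if a.score >= noise_threshold]
    return sorted(kept, key=lambda a: a.score, reverse=True)


if __name__ == "__main__":
    queue = [
        Alert("firewall", "port scan from known scanner", 0.2),
        Alert("edr", "credential dumping tool executed", 0.9),
        Alert("proxy", "unusual user agent", 0.5),
    ]
    for alert in prioritize(queue):
        print(f"{alert.score:.1f}  {alert.source}: {alert.description}")
```

The interesting part in a real product is how the score is produced and how well it holds up against analyst feedback; the filtering and sorting shown here is the easy part.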
ML Value Criteria – what have you done for me lately?
The main question security teams should ask themselves is not
‘do we have ML’, which is so generic as to be almost meaningless, but rather: Is
the ML technology in my network giving me the benefits I want, as per the
drivers outlined above – cost and scalability?
When evaluating solutions that promise AI and ML – whether as
part of a SIEM or as independent ML Log Analysis software – we again need to
look for the BENEFIT, and ask ourselves the following questions:
- Does this ML functionality enable me to meet the data volume needs of my organization?
- If the AI utilizes, for example, a supervised ML process, then what industries was the training data taken from? Different industries (banking, retail, insurance, manufacturing, etc.) experience different types of security events, which can impact the effectiveness of ML. Make sure the data sets used are associated with your industry, or at least with a similar one.
- Who are the security domain experts who provided feedback for the algorithm’s creation process? If the solution doesn’t employ the right domain experts to optimize ML algorithms, the ML will remain a theoretical exercise.
- How frequently does the algorithm need to be retrained to maintain its effectiveness? How is the system updated with retrained models?
And most importantly, the evaluation criteria should be
based on how well the solution helps you meet your goals, or drivers:
- If my driver is cost, then show me the percentage of data reduction.
- If my driver is scalability, or automating log investigation to achieve a shorter time to response, then show me the trends of these KPIs as part of my cyber operations.
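Both KPIs are simple to compute once the underlying numbers are available. The sketch below assumes you can measure the data volume entering and leaving the ML layer, and the response time for each handled incident; the function names and sample figures are illustrative only.

```python
from statistics import mean
from typing import Sequence


def percent_data_reduction(volume_in: float, volume_forwarded: float) -> float:
    """Share of incoming data (in percent) that is not forwarded to the SIEM."""
    if volume_in <= 0:
        raise ValueError("volume_in must be positive")
    if not 0 <= volume_forwarded <= volume_in:
        raise ValueError("volume_forwarded must be between 0 and volume_in")
    return 100.0 * (volume_in - volume_forwarded) / volume_in


def mean_time_to_respond(response_minutes: Sequence[float]) -> float:
    """Average response time for the incidents handled in one period."""
    return mean(response_minutes)


def kpi_change(kpi_by_period: Sequence[float]) -> float:
    """Change from the first to the last period.

    For time-based KPIs such as time to respond, a negative value
    means the trend is improving.
    """
    if len(kpi_by_period) < 2:
        raise ValueError("need at least two periods to show a trend")
    return kpi_by_period[-1] - kpi_by_period[0]


if __name__ == "__main__":
    # Hypothetical monthly response times, in minutes, per incident.
    months = [[90, 120, 60], [70, 80, 60], [45, 50, 40]]
    mttr = [mean_time_to_respond(m) for m in months]
    print("data reduction %:", percent_data_reduction(1000, 250))
    print("MTTR per month:", mttr)
    print("MTTR change:", kpi_change(mttr))
```

A first-to-last comparison is deliberately crude. Over longer periods, a rolling average or a simple regression gives a steadier picture of whether the ML layer is actually shortening time to response.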
Whether from SIEM or from ML log analysis platforms, if you
aren’t getting the benefits you need from ML, it’s time to continue the search.
Avi Chesla, Founder & CEO, empow