Splunk Incorporates Machine Learning to Aid Security Monitoring and DevOps Workflows

28 Sep 2016 7:16am

IT analytics company Splunk is doubling down on machine learning.

The next versions of Splunk Enterprise, Splunk IT Service Intelligence (ITSI), Splunk Enterprise Security (ES) and Splunk User Behavior Analytics (UBA) will include custom machine learning-based predictive analytics, in both the on-premises and cloud versions.

Splunk Cloud and Enterprise 6.5 get a new interface to help you build your own machine learning (ML) models, along with ML tools to predict maintenance windows and help you forecast demand and react to changes by building models based on your own traffic and customers.

Splunk ES and UBA are predictive analytics tools. They will now learn what the baseline of normal behavior looks like for your systems, so that a flood of alerts raised while everything is running smoothly does not bury the warnings about serious problems.

Splunk ITSI is already an ML-driven tool to help you find the root cause of problems and fix them faster; it gets new ML models to spot unusual events that could mean there’s a security or system problem.

“Both ITSI and UBA have machine learning models that are used to surface anomalies,” explains Splunk principal product manager for machine learning Manish Sainani. “ITSI is focused on key performance indicators, while UBA is focused on raw events and their sequences.”

“Machine learning can help detect, predict and prevent what matters most to an organization,” Sainani told The New Stack. “They can use it to help detect IT or security incidents, predict and prevent outages, forecast product inventories, and much more. Unlike human analysis, Splunk’s machine learning is always on – an important addition to their normal monitoring, operations and business analysis.”

“Typically, customers will use machine learning to detect anomalies, events or circumstances that do not fit normal patterns,” he said. “In IT that might be web server response times, or network congestion or many other infrastructure readings. More sophisticated customers might measure complex KPIs and IT services critical to the business. In security, they may look for anomalous user behaviors, systems communications, data transfers, or failed logins.”


Machine Learning for DevOps

Sainani suggests several DevOps workflows that Splunk’s machine learning is a good fit for:

  • Ranked root cause analysis for quicker resolution of issues, using clustering and prediction of categorical fields.
  • Outlier Detection using statistical methods to detect outliers across your key performance indicators (KPIs).
  • Adaptive Thresholds that adjust based on how your data is behaving so they automatically get updated to reflect changes in your data.
  • Anomaly Detection for both univariate (single) KPIs and multivariate (multiple) KPIs across your services.
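
To make the outlier detection item concrete, here is a minimal sketch of one common statistical method, the modified z-score based on the median and the median absolute deviation (MAD). It is an illustration only and not Splunk's implementation: the function name `mad_outliers`, the 3.5 cutoff and the sample response times are all assumptions made for this example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; not part of any Splunk product.
    """
    values = list(values)
    med = median(values)
    # Median absolute deviation: a spread estimate that a few extreme
    # readings cannot inflate the way they inflate a standard deviation.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the readings are identical, so anything that
        # differs from the median is treated as an outlier.
        return [i for i, v in enumerate(values) if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data is roughly normal.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Made-up web server response times in milliseconds.
response_ms = [120, 118, 125, 122, 119, 121, 950, 123, 117, 124]
print(mad_outliers(response_ms))  # prints [6]
```

Using the median rather than the mean keeps the single 950 ms reading from distorting the definition of "normal" that it is being compared against.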

Splunk uses three machine learning techniques: clustering, which sorts large volumes of data into groups of similar items; classification, which predicts which category an item belongs to; and regression, which uses historical values to predict future ones.
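
As a rough illustration of the regression idea, the sketch below fits a straight line to historical values with ordinary least squares and extends it forward. It assumes evenly spaced observations and a linear trend; the names `fit_line` and `forecast` and the request counts are invented for this example and say nothing about which algorithms Splunk ships.

```python
def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two historical values")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(ys, steps_ahead):
    """Extend the fitted line steps_ahead points past the last observation."""
    slope, intercept = fit_line(ys)
    return slope * (len(ys) - 1 + steps_ahead) + intercept


# Made-up daily request counts that grow by 10 per day.
daily_requests = [100, 110, 120, 130, 140]
print(forecast(daily_requests, 2))  # prints 160.0
```

Real demand data is rarely this tidy, so a production model would normally account for seasonality and noise as well.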

User Behavior Analytics uses those machine learning techniques for behavior baselining and modeling, anomaly detections (for which it has more than 30 models) and advanced threat detection. For both tools, you can also create your own custom analytics.

IT Service Intelligence uses machine learning for anomaly detection, adaptive thresholding and KPI management. It needs seven days of historical data for that detection to be statistically sound. “The algorithm by itself does not require more than two days’ worth of historical data,” Sainani told us, but Splunk decided on seven days of data for better accuracy.
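
The article does not describe how ITSI computes its adaptive thresholds, so the following is only a sketch of the general idea under stated assumptions: one upper threshold per hour of the day, set to the mean plus three standard deviations of the readings seen at that hour across the history. The function `adaptive_thresholds`, the multiplier `k` and the synthetic seven-day history are all hypothetical.

```python
from statistics import mean, stdev

HOURS_PER_DAY = 24


def adaptive_thresholds(hourly_values, k=3.0):
    """Return one upper threshold per hour of day from an hourly history.

    Assumes hourly_values starts at hour 0 and has one reading per hour.
    """
    if len(hourly_values) < 2 * HOURS_PER_DAY:
        # This sketch needs two readings per hour to compute a spread.
        raise ValueError("need at least two days of hourly values")
    thresholds = []
    for hour in range(HOURS_PER_DAY):
        same_hour = hourly_values[hour::HOURS_PER_DAY]
        thresholds.append(mean(same_hour) + k * stdev(same_hour))
    return thresholds


# Synthetic seven-day history: busy from 09:00 to 17:00, quiet otherwise,
# with a small drift from one day to the next.
history = []
for day in range(7):
    for hour in range(HOURS_PER_DAY):
        base = 500 if 9 <= hour < 17 else 100
        history.append(base + day)

thresholds = adaptive_thresholds(history)
reading = 300
print(reading > thresholds[3])   # 03:00 is normally quiet
print(reading > thresholds[12])  # noon is normally busy
```

Recomputing the thresholds as new data arrives is what makes them adaptive: the same reading of 300 is unusual at 03:00 and unremarkable at noon.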

“Once the baseline has been fed to the anomaly detection model, it can immediately start detecting and alerting on unusual patterns it hasn’t seen before.” And if your systems are already compromised, he claims “the algorithm is robust enough to avoid being affected by it.”

If you want custom machine learning models for working with the data you have in Splunk Enterprise (which already offers more than 20 machine learning commands), the new ML Toolkit also lets you work with open source Python libraries (scikit-learn, statsmodels, pandas, numpy, scipy) that include over 300 algorithms, Sainani told us.

These algorithms can be applied directly to the data for detection, alerting or analysis for specific use cases, whether for IT or security. The ML Toolkit also provides a guided workbench for data scientists to build their own models, Sainani said.

Anomaly detection.

The interface guides you through creating custom machine learning analytics with interactive examples. “With a single click they can deploy models into production to help detect IT or security incidents, predict and prevent outages, forecast product inventories, and much more. The biggest differentiator that the ML Toolkit brings is the ease with which a customer can build a machine learning model and put it into operation leveraging Splunk’s alerting and scheduled search framework.”

ML is becoming an increasingly useful security and analytics tool, and it’s a good fit for Splunk’s existing visualizations, believes Jason Stamper, data platforms and analytics analyst at 451 Research. “With a broad integration of machine learning, Splunk provides a comprehensive answer to one of the biggest challenges facing modern organizations: how to harness diverse, prevalent and increasingly profuse amounts of data to gain valuable business insights.”

Images: Splunk.
