Why Using a Time-Series Database Improves Security Monitoring
InfluxData sponsored this post.
Security monitoring is complex. Successful implementation of a security monitoring infrastructure involves people, process, technology and data, and requires multiple iterative phases to reach maturity. Security data comes from multiple sources, and the prevailing method at the time of this writing is to acquire it by consuming log files from every possible asset (be it an application, database, virtual machine, container, microservice, operating system, server, network component, storage device or even an intelligent power strip) and then sending that data to a SIEM or log management system such as Splunk, Sumo Logic or Elastic.
That’s not to mention the digital supply chain. In the past decade, you’ve shifted from primarily on-prem software to primarily cloud-based services and SaaS applications. And a lot of them, too — one survey shows that a typical company can use anywhere from 100 to 300 SaaS applications, depending on company size. And when you look at the growth in the SaaS market landscape for just one portion of the enterprise (marketing) over the last decade, from a few hundred vendors to several thousand, this trend shows no sign of slowing — that means security monitoring has to cover a lot of cloud services and SaaS apps.
This shift in digital supply chain requires a shift in security monitoring practices. When software was on-prem, you could almost always count on being able to access log files (which were typically available at no charge) and send them to your SIEM or log management system.
Cloud and SaaS Make Security Monitoring Difficult
With cloud services and SaaS apps, you will not have direct access to the log information and many services do not provide indirect access. Even when access is available, it’s not easy and usually costs extra. So, you need to move from analyzing on-prem logs to collecting security-related events from cloud services through APIs — assuming such APIs are even available.
Whether you’re running software on-prem, as a cloud service, or some combination of the two, a crucial challenge is to identify all your software assets. This inventory defines the attack surface you’re trying to monitor and protect. Once you know your asset portfolio, you also need tooling that continuously discovers new assets, because cloud and related technologies allow assets to be created dynamically. When you have a sense of the assets in scope, the instrumentation phase begins, which includes adding enhanced logging capabilities to the primary assets in scope: your applications.
Application Event Logging Is Crucial
It’s important to remember that, of the morass of information recorded across multiple assets in log files and event APIs, only a subset of events is truly valuable for security monitoring. I’m talking about custom application events, which provide much greater insight than infrastructure logging alone. Unfortunately, application event logging is often missing, disabled or poorly configured, meaning that security teams have a blind spot where they need visibility the most.
The reason application logs are invaluable is that they help to:
- Identify security incidents
- Monitor policy violations
- Establish baselines of “normal” behavior
- Assist non-repudiation controls, to provide proof of the origin and integrity of security event data
- Provide information about attacks, breaches and other unusual conditions
- Facilitate incident investigation with application-specific data that is lacking in other log sources
- Identify security vulnerabilities
- Defend against vulnerability exploitation through attack detection
This guide from OWASP further describes how to set up application event logging.
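To make application event logging concrete, here is a minimal sketch in Python that uses only the standard library. The logger name `app.security`, the event name `authn_login_fail`, the `security` key passed through `extra`, and the helper functions are all assumptions chosen for illustration; they are not taken from the OWASP guide.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Hypothetical convention: callers attach structured details
        # with extra={"security": {...}} when logging.
        payload.update(getattr(record, "security", {}))
        return json.dumps(payload, sort_keys=True)


def build_security_logger(stream=None) -> logging.Logger:
    """Create a dedicated logger for security events (stderr by default)."""
    logger = logging.getLogger("app.security")  # assumed logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep security events out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_auth_failure(logger: logging.Logger, user: str, source_ip: str) -> None:
    """Record a failed login as a structured application security event."""
    logger.warning(
        "authn_login_fail",  # assumed event name
        extra={"security": {"user": user, "source_ip": source_ip}},
    )
```

Emitting one JSON object per event keeps the who, what and when of each security event machine-readable, which makes later conversion to time series far simpler than parsing free-form text.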
Once you’ve begun to collect your myriad log files, document your security events from APIs and have appropriately instrumented your application security events, the real work can begin — mining for potential security events and anomalies to inform the alerting and incident response process.
Time-Series Databases Enhance Security Monitoring
This is where a time-series database becomes a critical and natural solution: it converts all your log data and security events into collections of time series. Doing so enables you to quickly correlate time-series events across dependent or connected assets, articulate the indicators and trace the vector of compromise. In turn, this enables faster incident detection, response, remediation and forensics workflows.
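As a rough illustration of correlating events across assets by time, the sketch below pulls every event that falls within a window around a suspicious moment. The `(timestamp, asset, event)` tuple layout and the function name `correlate` are assumptions made for this example; a real time-series database would express the same idea as a time-range query.

```python
from bisect import bisect_left, bisect_right


def correlate(events, anchor_time, window_seconds):
    """Return events within +/- window_seconds of anchor_time.

    `events` is a list of (timestamp, asset, event) tuples that is already
    sorted by timestamp (Unix seconds), as time-series storage would keep it.
    """
    times = [timestamp for timestamp, _, _ in events]
    start = bisect_left(times, anchor_time - window_seconds)
    end = bisect_right(times, anchor_time + window_seconds)
    return events[start:end]


timeline = [
    (100, "vpn", "login_fail"),
    (160, "app", "login_ok"),
    (170, "db", "bulk_export"),
    (900, "app", "logout"),
]
# Everything that happened within 30 seconds of t=165, across all assets.
related = correlate(timeline, anchor_time=165, window_seconds=30)
```

Because the data is ordered by time, the lookup is a pair of binary searches rather than a scan of every log line, which is the property that makes this kind of correlation fast.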
Traditional log management-driven security monitors require massive storage subsystems which contain mostly unnecessary data — the noise factor inherent in log data — and thus simply cannot perform at the scale and speed necessary for effective monitoring and response.
In contrast, time-series databases normalize security event data at ingest into an efficient, standardized format, allowing you to store security data economically and index on multiple attributes to enable fast searches. For example, some time-series databases measure their query response times in tens of milliseconds. Given this efficient data format, you can store more events on a smaller budget.
Time-series databases are well-suited for tracking security metrics. There are many such metrics one can track, such as the number of:
- Authentication attempts over time
- Unsuccessful authentication attempts over time
- Successful authentication attempts over time
- Unique accounts over time
- IP addresses per account over time
- Accounts per IP address over time
- Privileged operations over time
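Several of the metrics above can be derived by grouping authentication events into fixed time windows. The following is a minimal sketch in plain Python; the `AuthEvent` record, its field names and the 60-second default window are assumptions for illustration, and a time-series database would normally compute these aggregates in a query instead.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    timestamp: int   # Unix seconds
    account: str
    source_ip: str
    success: bool


def auth_metrics(events, window_seconds=60):
    """Aggregate authentication events into per-window security metrics."""
    buckets = defaultdict(list)
    for event in events:
        # Align each event to the start of its window.
        start = event.timestamp - event.timestamp % window_seconds
        buckets[start].append(event)

    metrics = {}
    for start in sorted(buckets):
        group = buckets[start]
        ips_by_account = defaultdict(set)
        accounts_by_ip = defaultdict(set)
        for event in group:
            ips_by_account[event.account].add(event.source_ip)
            accounts_by_ip[event.source_ip].add(event.account)
        metrics[start] = {
            "attempts": len(group),
            "failures": sum(not e.success for e in group),
            "successes": sum(e.success for e in group),
            "unique_accounts": len(ips_by_account),
            "max_ips_per_account": max(len(ips) for ips in ips_by_account.values()),
            "max_accounts_per_ip": max(len(accts) for accts in accounts_by_ip.values()),
        }
    return metrics
```

A sudden rise in `failures`, `max_ips_per_account` or `max_accounts_per_ip` from one window to the next is the kind of signal these metrics are meant to surface.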
One can use machine learning to build a behavioral model of typical usage and then look for real-time events that deviate from this model. Time-series databases can apply advanced algorithms for anomaly detection, such as Median Absolute Deviation (MAD), Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), or naive Bayes classifiers.
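Of the algorithms mentioned, Median Absolute Deviation is the simplest to sketch. The example below flags points whose modified z-score exceeds a threshold; the function name `mad_anomalies`, the sample data (imagined as failed logins per minute) and the commonly cited 3.5 cutoff are assumptions for illustration, not a description of how any particular database implements it.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return the indices of values flagged as anomalous by MAD.

    Uses the modified z-score 0.6745 * |x - median| / MAD, where MAD is
    the median of the absolute deviations from the median.
    """
    if not values:
        return []
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the points equal the median, so the score is
        # undefined; this sketch simply reports nothing in that case.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Hypothetical failed-login counts per minute; index 6 is a burst.
failed_logins = [12, 11, 13, 12, 14, 12, 95, 13, 11, 12]
suspicious = mad_anomalies(failed_logins)
```

MAD relies on medians rather than means, so a single extreme burst does not distort the baseline it is being compared against.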
Despite these advantages, time-series databases are generally not used in security monitoring applications (such as SIEM or intrusion detection and prevention systems), though the security community has begun to explore this approach. I believe using time-series databases has merit.
To be clear, time-series databases should complement SIEM and other log-based security monitoring systems, not replace them. Ultimately, I’d expect to see SIEM vendors embed time-series databases as a component of their products.
One great example of a lightweight security monitoring application that you can start using now, and that’s based on a time-series platform, is a community template recently contributed by our partner Bonitoo. This application monitors for abusive IP addresses so they can be temporarily blocked if your application requires port 22 to be exposed for SSH access.
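The template itself is not reproduced here. As a rough sketch of the underlying idea only, the following Python counts failed SSH password attempts per source address and lists the addresses that reach a threshold. The regular expression targets the usual OpenSSH "Failed password" message, while the function name `abusive_ips` and the threshold of five failures are assumptions for illustration.

```python
import re
from collections import Counter

# Matches the usual OpenSSH message, for example:
# "Failed password for invalid user admin from 203.0.113.5 port 54321 ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def abusive_ips(log_lines, max_failures=5):
    """Return source addresses with at least `max_failures` failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(2)] += 1  # group 2 is the source address
    return sorted(ip for ip, count in failures.items() if count >= max_failures)
```

This sketch counts over whatever lines it is given; a production version would count within a sliding time window and expire blocks after a while, which is exactly the kind of query a time-series platform is suited to.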
At our company, we’re implementing this concept. We’ve started to build security monitoring applications based on our own time-series platform, but we’re still early in our journey. You can read about our progress in this blog by my colleague Darin Fisher.