A transition’s under way in the use of big data to improve security, according to Leo Meyerovich, co-founder of Graphistry, an Oakland, Calif.-based startup focused on using visual graph technology to improve security breach investigation.
In the beginning, it was all about just loading in all that data and maybe you could write some queries on it. But that was overwhelming to analysts. What queries do you write?
After that, many companies set out to use machine learning techniques to pick out the most important alerts and prioritize them. “Machine learning is like treating the symptoms. It’s kind of a better way to raise alerts,” Meyerovich said, making a comparison to the way a doctor diagnoses an ear infection.
“Honestly, that stuff is commodity technology at this point. There are some cool AI startups, but in the end, they’re all doing random forest and deep learning … Graphistry is the next generation after that,” he said.
Graphistry is focused on the investigation aspect: working with 100 times or 1,000 times more data and presenting results that are useful to everyday security teams, the company claims…
GPUs in the Browser
Meyerovich and co-founder Matt Torok came from the University of California, Berkeley, where they worked on parallelizing web browser operations. Torok also created the Superconductor language for GPU visual analytics.
Graphistry represents an expansion of their parallel browser work.
“One of the insights that kind of started it was that we can use GPUs in the browser, get a lot of kind of rendering stuff, then connect that with a bunch of GPUs in the data center, which is now commodity, and get within 100 milliseconds for the whole round trip. And that enables a new class of software,” Meyerovich explained in a video of a New York Enterprise Technology meetup.
“Imagine something like Google Maps or Netflix, where it’s kind of bouncing back and forth between them. What was exciting for us was saying [that] rather than your visual experience is just your phone or your laptop, it’s a supercomputer in your pocket or your desktop and a supercomputer in the data center that’s probably even bigger,” Meyerovich said in an interview.
“Once we realized we could start looking at so much more data at the same time — imagine a second of interactivity with millions of samples — one of the best uses of this is security or more broadly, any type of event analysis.”
At that meetup, Meyerovich demonstrated Graphistry using one day of attack alerts taken from a customer’s security information and event management (SIEM) system.
For the demonstration, Meyerovich spun up a Hadoop cluster, ran a Spark SQL query, and for that one day, got 130 million events across the cluster. Filtering to the priority 10s (the most urgent priority) reduced the data to about 880,000 events, of which about 840,000 came from the firewall.
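The reduction described here amounts to a filter followed by a grouped count. The sketch below shows that shape in plain Python on a handful of made-up events, not the demo's data and not Spark SQL; the field names `priority` and `source` are assumptions rather than the SIEM's actual schema.

```python
from collections import Counter

# Hypothetical events; field names and values are illustrative only.
events = [
    {"priority": 10, "source": "firewall"},
    {"priority": 10, "source": "firewall"},
    {"priority": 10, "source": "ids"},
    {"priority": 3, "source": "firewall"},
    {"priority": 7, "source": "mail"},
]


def urgent_by_source(events, level=10):
    """Keep events at the given priority and count them per source."""
    urgent = [e for e in events if e["priority"] == level]
    return Counter(e["source"] for e in urgent)


counts = urgent_by_source(events)
```

On a real cluster the same two steps would typically be a `WHERE` clause and a `GROUP BY`, but the logic is the same: narrow to the urgent events, then see which source dominates.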
The software showed every data point, represented across a graph. “This is kind of like Google Maps on the client. There’s very direct interaction. In this view, every device on the network is a point,” Meyerovich explained.
The analyst can investigate what else happened on those nodes of interest and what happened afterward, and can use color coding to bucket events by category. Graph views quickly reveal the scope of events, as well as progression, correlations and outliers, according to the company.
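One way to picture the graph view is as a set of device nodes joined by alert edges, with each edge colored by its event category. The sketch below is an illustration under assumed names, not Graphistry's implementation: the record layout, the `COLORS` palette and the helper functions are all hypothetical.

```python
# Hypothetical alert records: (timestamp, source device, destination device, category)
alerts = [
    (1, "10.0.0.5", "10.0.0.9", "scan"),
    (2, "10.0.0.5", "10.0.0.7", "scan"),
    (3, "10.0.0.9", "10.0.0.2", "exfiltration"),
    (4, "10.0.0.3", "10.0.0.2", "login_failure"),
]

# Assumed palette mapping each event category to a display color.
COLORS = {"scan": "orange", "exfiltration": "red", "login_failure": "blue"}


def build_graph(alerts):
    """Turn alerts into device nodes and color-coded edges."""
    nodes = set()
    edges = []
    for ts, src, dst, category in alerts:
        nodes.update((src, dst))
        edges.append(
            {"time": ts, "src": src, "dst": dst, "color": COLORS.get(category, "gray")}
        )
    return nodes, edges


def events_after(edges, device, since):
    """Events touching `device` strictly after time `since`, oldest first."""
    hits = [e for e in edges if device in (e["src"], e["dst"]) and e["time"] > since]
    return sorted(hits, key=lambda e: e["time"])


nodes, edges = build_graph(alerts)
followups = events_after(edges, "10.0.0.9", since=1)
```

Here `followups` answers the "what happened after" question for a single device: it keeps only the later events that involve the node the analyst selected.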
In the video, Meyerovich explained the company’s technology this way:
“We plug into all your Big Data stacks, if you have Splunk,” Meyerovich said. “We’re not going to add more alerts to somebody’s workload,” he said in the interview. “We’re going to help them more efficiently work with them.”
Graphistry gathers context from all the systems (databases, mail servers or any other API-exposed system) and drops it into the analytics engine. Then it graphically presents all that event activity to the analyst so he or she can correlate better views of what actually happened.
“If you have somebody doing an investigation, we’re looking at where are the data bottlenecks. Why does it become unreliable?” he said.
Visual playbooks are a second piece of technology on which Graphistry is focused. In effect, users can document the workflows for the events they’re most commonly investigating.
“We found that even if we could show all that data, people didn’t necessarily have the right data loaded in. Or maybe you have the right data loaded in, but you’re a senior analyst and you just joined the job and don’t know the environment. Or you’re a junior analyst and you don’t know any of these things,” Meyerovich said.
“We found that the ways they gather the data, connect the data and work through the data — a lot of the use cases are kind of similar. That’s one of the bottlenecks, all this institutional knowledge that people are creating,” he said. “It’s kind of like dashboards-plus-plus. It’s a way to write down those multi-step investigations across multiple systems.”
Visual playbooks provide a visual analytics session mirroring the notion of a runbook with analytics.
“Rather than visual analytics becoming an exploratory and ad-hoc sort of thing, it becomes a reliable, day-to-day thing. Rather than having data available to people who write code all day, it becomes available to regular analysts,” he said.
The approach also resembles code coverage in the software engineering world, where the aim is to make processes safe automatically.
“We’ve been applying that idea to security. With the playbooks, you can ask when an issue comes in, are there automated procedures to handle it? What percentage of your daily or weekly incidents are covered? As we’re looking at these investigation domains — security, fraud, etc. — we help teams increase the amount of automation they use to investigate,” he said.
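The coverage question can be made concrete with a small calculation. The sketch below assumes each incident carries a type label and that a playbook exists per type; the function name, the labels and the sample data are hypothetical and do not come from Graphistry.

```python
def playbook_coverage(incident_types, playbooks):
    """Fraction of incidents whose type has an automated playbook."""
    if not incident_types:
        return 0.0
    covered = sum(1 for t in incident_types if t in playbooks)
    return covered / len(incident_types)


# Hypothetical week of incidents and the playbooks a team has written.
incidents = ["phishing", "malware", "phishing", "insider", "phishing"]
playbooks = {"phishing", "malware"}

coverage = playbook_coverage(incidents, playbooks)
```

In this made-up week, four of the five incidents fall under an existing playbook, so the uncovered "insider" case is the obvious candidate for the next playbook to write.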
Graphistry also is part of an initiative called GoAI (GPU Open Analytics Initiative) that’s enabling different types of GPU analytics to work together.