Intezer Provides Code ‘DNA Mapping’ to Root out Malware

26 Feb 2018 10:21am

While a host of new security companies employ artificial intelligence and automation to detect anomalies and shut down cyber attacks, Israeli startup Intezer takes a different tack: analyzing the source of code itself.

CEO Itai Tevet calls the approach, which takes its cues from the body’s immune system, “DNA mapping for software.” The company aims to be the “Google of binary code” by providing a search engine for all the world’s code.

“We know that attackers are able to bypass any type of anomaly detection systems or next-generation systems by just being under the radar and being part of the noise, making themselves look legitimate,” he said.

“[Our technology] looks at thousands of small pieces of software we call genes. Using this gene-extraction algorithm, we created the first genome-coded database. This is essentially billions of small pieces of code from software, both legit and malicious, so from that database, we can tell where this code came from, how many times we’ve seen it before,” he said.

It’s rich data about tiny fragments of software. It looks at the machine code level, meaning it can differentiate between code from a trusted source, malware and code that has never been seen before.

“It can tell you this is a piece of code I’ve seen in Photoshop or this is a piece of code I’ve seen in the malware Zeus or if it’s a piece of code never seen before in any software in the world. It’s as if you had hundreds of reverse engineers looking at every line of code and telling you where they’ve seen it before,” he said.
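The lookup Tevet describes can be illustrated with a toy example. The article does not explain how Intezer actually extracts genes, so the sketch below makes its own assumptions: a "gene" is the SHA-256 hash of every overlapping 8-byte window of machine code, and the names `GENE_SIZE`, `extract_genes` and `GeneDatabase`, along with the source labels, are hypothetical.

```python
import hashlib
from collections import defaultdict

GENE_SIZE = 8  # hypothetical fragment length in bytes, not Intezer's real value


def extract_genes(code: bytes, size: int = GENE_SIZE) -> set[str]:
    """Hash every overlapping `size`-byte window; each hash is one toy 'gene'."""
    return {
        hashlib.sha256(code[i:i + size]).hexdigest()
        for i in range(len(code) - size + 1)
    }


class GeneDatabase:
    """Maps each gene to the labels of the software it has been seen in."""

    def __init__(self) -> None:
        self._sources: dict[str, set[str]] = defaultdict(set)

    def index(self, label: str, code: bytes) -> None:
        """Record every gene of `code` under the given source label."""
        for gene in extract_genes(code):
            self._sources[gene].add(label)

    def lookup(self, code: bytes) -> dict[str, int]:
        """Count genes per known source; genes with no match count as 'unseen'."""
        counts: dict[str, int] = defaultdict(int)
        for gene in extract_genes(code):
            labels = self._sources.get(gene)
            if not labels:
                counts["unseen"] += 1
            else:
                for label in labels:
                    counts[label] += 1
        return dict(counts)


db = GeneDatabase()
# Illustrative byte strings standing in for trusted and malicious binaries.
db.index("trusted:editor", b"\x55\x48\x89\xe5\x48\x83\xec\x10\x89\x7d\xfc\xc9\xc3")
db.index("malware:banker", b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e")

# A sample that reuses the start of the malicious code, followed by new bytes.
sample = b"\x31\xc0\x50\x68\x2f\x2f\x73\x68" + b"\x90" * 8
result = db.lookup(sample)
print(sorted(result.items()))
```

In this toy run, one of the sample's nine genes matches the malicious entry and the other eight have never been seen, which is the kind of mixed verdict the article describes. A real system would need fragments that survive recompilation and relocation, something fixed byte windows do not provide.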

Code reuse, such as shared libraries, is the norm in both legitimate software development and malware. Code that has never been seen before is a big red flag; though not necessarily malicious, it needs to be investigated, he said.

In the 2016 cyber attacks against the Democratic Party, 80 percent of the code had already been seen in attacks associated with the Russian government, he said. And Intezer claims to be the first to link code used in the WannaCry attacks to North Korea.
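The article does not say how the 80 percent figure was calculated. One plausible reading is the share of a sample's code fragments that also appear in previously attributed attacks. The sketch below shows that arithmetic under this assumption; `shared_fraction` and the gene identifiers are hypothetical.

```python
def shared_fraction(sample_genes: set[str], family_genes: set[str]) -> float:
    """Fraction of the sample's genes that also appear in the known family."""
    if not sample_genes:
        return 0.0
    return len(sample_genes & family_genes) / len(sample_genes)


# Hypothetical gene identifiers, purely illustrative.
known_family = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"}
new_sample = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "x1", "x2"}

print(f"{shared_fraction(new_sample, known_family):.0%}")
```

Here eight of the ten genes in the new sample were already seen in the known family. The measure is deliberately asymmetric: it asks how much of the new sample is reused, not how much of the family it covers.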

Deep Insight

Founded in 2015, Intezer has raised $10 million, most recently an $8 million Series A round led by Intel Capital.

Tevet was formerly head of IDF CERT, the Israel Defense Forces’ cyber incident response team, which on a daily basis tried to thwart attacks from Russia, China and other nation-states. Co-founders Roy Halevi and serial entrepreneur Alon Cohen, co-founder of CyberArk, also came from the IDF.

Intezer Analyze is a subscription-based SaaS product targeted to incident-response teams, SOC teams and managed security service providers in large organizations.

It’s designed to help large organizations deal with the daily deluge of security alerts, and address the cybersecurity workforce shortage, which Frost & Sullivan predicts will be a 1.8 million worker gap by 2022.

Most customers plug it into their security incident management (SIM) systems to gain deep insight into the source of the code as well as the risk to the business.

“When you have an alert, you need to determine what it is, what is the damage to my company, what is the level of sophistication, what is the intention of the attacker, is it a banking Trojan, is it ransomware? All these questions can be answered in one second in the Tier 1 Security Operations Center,” Tevet said.

Intezer Analyze can scan computer memory and find malicious code that cannot be detected on disk. Even a small bit of dormant code waiting for D-day to explode can be detected before it has done anything, he said.

The software can winnow down thousands of daily alerts into a handful prioritized according to urgency, sophistication and risk level.
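How the product weighs those three factors is not described, so the following is only a sketch of the general idea of ranking alerts and keeping the top few. The `Alert` fields, the 0-10 scales and the unweighted sum in `priority` are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    urgency: int         # hypothetical 0-10 scale
    sophistication: int  # hypothetical 0-10 scale
    risk: int            # hypothetical 0-10 scale

    def priority(self) -> int:
        # Unweighted sum, chosen only to keep the example simple.
        return self.urgency + self.sophistication + self.risk


def top_alerts(alerts: list[Alert], limit: int = 3) -> list[Alert]:
    """Return the `limit` highest-priority alerts, highest first."""
    return heapq.nlargest(limit, alerts, key=Alert.priority)


alerts = [
    Alert("adware popup", 1, 1, 1),
    Alert("unknown loader in memory", 9, 8, 9),
    Alert("banking Trojan variant", 5, 5, 5),
    Alert("obfuscated script", 2, 9, 3),
]
for alert in top_alerts(alerts, limit=2):
    print(alert.name, alert.priority())
```

Using `heapq.nlargest` avoids sorting the full list when only a handful of alerts out of thousands need to be surfaced.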

With the community edition, users can drag and drop files to be scanned. The enterprise version provides automation; it can be integrated with any other tool for incident response.

By the end of the year, Intezer plans to release a second product, Intezer Immune, which will offer real-time monitoring of all code in an organization’s systems. It will require a single agentless on-premises component to cover all endpoints. To deal with constant code changes, and to save time and memory, it will analyze only the changes when they occur.
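Analyzing only what has changed is a common incremental technique. The article gives no detail on how Intezer Immune will detect changes, so this minimal sketch assumes content hashing: a file is re-analyzed only when its SHA-256 digest differs from the one recorded last time. `ChangeTracker` and the file paths are hypothetical.

```python
import hashlib


class ChangeTracker:
    """Remembers a digest per path and reports only new or modified content."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}

    def changed(self, snapshot: dict[str, bytes]) -> list[str]:
        """Return paths whose content is new or differs from the last snapshot."""
        pending = []
        for path, content in snapshot.items():
            digest = hashlib.sha256(content).hexdigest()
            if self._seen.get(path) != digest:
                self._seen[path] = digest
                pending.append(path)
        return pending


tracker = ChangeTracker()
first = tracker.changed({"app.exe": b"v1", "lib.dll": b"v1"})
second = tracker.changed({"app.exe": b"v1", "lib.dll": b"v2"})
print(first, second)
```

On the first pass everything is new, so both files are queued. On the second pass only the modified file is returned, which is where the savings in time and memory would come from.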

Another startup building a source-code database, though not specifically security-related, is source{d}. It mines a data set based on 57 million public Git repositories, amounting to hundreds of terabytes of source code, to train machine-learning models to understand natural language, intent and similarity.

source{d}, which originated in Madrid, is focused on applying machine learning on top of source code to help enterprises better manage their code base.
