Machine Learning / Security

EvilModel: Malware that Hides Undetected Inside Deep Learning Models

4 Feb 2022 12:00pm, by

A team of researchers from the University of California, San Diego, and the University of Illinois has found that it is possible to hide malware inside deep learning neural networks and deliver it to an unsuspecting target, without being detected by conventional anti-malware software.

Not surprisingly, this new work is highlighting the need for better cybersecurity measures to counteract and protect users from the very real possibility of AI-assisted attacks, especially as individuals and businesses become increasingly reliant on AI in their daily activities.

In a pre-print paper outlining EvilModel — the team’s ominously named method for embedding malware in deep learning neural networks — the team discovered that it was possible to infect a deep learning model with malware, and have it fool anti-malware detectors, all without significantly affecting the model’s performance.

To achieve this, the team used an approach known as steganography, in which pieces of data in a system are replaced with other data that carries a hidden message or function. To hide their sample piece of malware, the team first broke it into fragments of just 3 bytes each, small enough to escape detection.

Detecting these altered parts of the model is made all the more difficult because deep learning models are built from multiple layers of artificial neurons, which together can comprise millions of parameters interconnecting the layers. Mainstream deep learning frameworks like PyTorch and TensorFlow generally store each parameter as a 4-byte floating-point number. As the team discovered, 3 of a parameter's 4 bytes can be replaced with a fragment of malware code, embedding the malicious payload without significantly affecting the model's performance.
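The idea can be sketched in a few lines of Python: overwrite the three low-order bytes of each 32-bit float parameter with payload bytes, keeping the high-order byte (the sign and most of the exponent) so the stored value changes only modestly. This is an illustrative sketch of the general technique, not the paper's exact embedding algorithm; `embed_bytes` and `extract_bytes` are hypothetical helper names.

```python
import struct

def embed_bytes(params, payload):
    """Hide `payload` by overwriting the 3 low-order (mantissa-side)
    bytes of each float32 parameter. Illustrative sketch only."""
    chunks = [payload[i:i + 3] for i in range(0, len(payload), 3)]
    out = []
    for i, p in enumerate(params):
        raw = bytearray(struct.pack("<f", p))   # little-endian float32
        if i < len(chunks):
            raw[0:3] = chunks[i].ljust(3, b"\x00")  # low 3 bytes = payload
            out.append(struct.unpack("<f", bytes(raw))[0])
        else:
            out.append(p)  # leave remaining parameters untouched
    return out

def extract_bytes(params, length):
    """Recover the payload by reading back the low 3 bytes of each parameter."""
    data = bytearray()
    for p in params:
        data += struct.pack("<f", p)[0:3]
    return bytes(data[:length])
```

A receiver who knows which parameters were touched can reassemble the fragments in order and reconstruct the original binary, while an anti-malware scanner sees only a file of plausible floating-point weights.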

“When neurons are replaced by malware bytes, the structure of the model remains unchanged,” noted the team. “As the malware is disassembled in the neurons, its characteristics are no longer available, which can evade detection by common anti-virus engines. Since the neural network model is robust to changes, there is no significant loss in performance.”

Workflow for EvilModel

The team’s experiments showed that it was possible to hide at least 36.9 megabytes of malware in their deep learning model, at the cost of only a 1% drop in accuracy. In particular, the experiments focused on convolutional neural networks (CNNs), a type of deep learning model used in a wide range of applications, including image classification, image processing and image recognition.
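A back-of-the-envelope calculation makes the 36.9-megabyte figure plausible. At 3 hidden bytes per 4-byte parameter, a network on the scale of AlexNet (roughly 61 million parameters, a commonly cited figure and an assumption here) could in principle carry well over 100 megabytes even if only part of it were used:

```python
def capacity_mb(n_params: int, bytes_per_param: int = 3) -> float:
    """Theoretical steganographic capacity: 3 hidden bytes per 32-bit parameter."""
    return n_params * bytes_per_param / (1024 * 1024)

# AlexNet is commonly cited at ~61 million parameters (assumed figure).
print(round(capacity_mb(61_000_000), 1))  # roughly 174.5 MB of theoretical capacity
```

The real usable capacity is lower, since heavily modifying every parameter would start to erode accuracy, but the headroom is clearly large.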

For the study, the team tested their method on an array of popular CNNs, including AlexNet, VGG, ResNet, Inception, and MobileNet. CNNs are ideal for covert delivery of malware, as they contain many different types of layers and millions of parameters. In addition, many CNNs come pre-trained, meaning that some users may download them without knowing exactly what may be embedded within the model.

“In fact, we found that due to the redundant neurons in the network layers, changes in some neurons have little impact on the performance of the neural network,” explained the team. “Also, with the structure of the model unchanged, the hidden malware can evade detection from antivirus engines. Therefore, malware can be embedded and delivered to the target devices covertly and evasively by modifying the neurons.”

While this scenario is alarming enough, the team points out that attackers can also publish an infected neural network on a public repository like GitHub, where it can be downloaded at a much larger scale. In addition, attackers can deploy a more sophisticated form of delivery known as a supply chain attack (also called a value-chain or third-party attack), in which malware-embedded models are disguised as automatic updates that are then downloaded and installed onto target devices. It is devastating, and it works: this was the method of attack behind the massive US government data breach in 2020.

The team notes, however, that the embedded malware can be destroyed by retraining or fine-tuning a model after it is downloaded, as long as the infected layers are not “frozen.” Parameters in frozen layers are not updated during fine-tuning, which would leave the embedded malware intact.

“For professionals, the parameters of neurons can be changed through fine-tuning, pruning, model compression or other operations, thereby breaking the malware structure and preventing the malware from recovering normally,” said the team.
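A toy example shows why any weight update breaks the payload: even one small gradient step rewrites the parameter's bytes, so the hidden fragment can no longer be recovered. This is a minimal sketch of the effect, not the paper's procedure.

```python
import struct

# Toy: hide 3 payload bytes in the low-order bytes of one float32 parameter.
raw = bytearray(struct.pack("<f", 0.25))
raw[0:3] = b"ABC"                        # low 3 bytes (little-endian) = payload
infected = struct.unpack("<f", bytes(raw))[0]

# A single small weight update during fine-tuning rewrites the parameter...
updated = infected - 0.001

# ...and the recovered bytes no longer match the payload.
recovered = struct.pack("<f", updated)[0:3]
print(recovered == b"ABC")  # False: the embedded fragment is destroyed
```

Pruning and model compression have the same effect for the same reason: they change or remove the parameter values that carry the fragments.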

Another possible safeguard is to download deep learning models only from trusted sources, and to implement improved systems for verifying updates in order to avoid supply chain attacks. Ultimately, the team points out that there is a growing need for heightened security and better practices across the machine learning development pipeline.
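In practice, one simple integrity check is to compare a downloaded model file against a cryptographic hash published by a trusted source. The `verify_model` helper below is a hypothetical sketch, not an API from any particular framework:

```python
import hashlib

def verify_model(path: str, expected_sha256: str) -> bool:
    """Return True if the file at `path` matches the published SHA-256 digest.
    Hypothetical helper for illustrating download verification."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MB chunks so large model files don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

A hash check only confirms the file is the one the publisher released; it does not prove the publisher's model is clean, which is why trusting the source still matters.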

“This paper proves that neural networks can also be used maliciously. With the popularity of AI, AI-assisted attacks will emerge and bring new challenges for computer security. Network attack and defense are interdependent. We believe countermeasures against AI-assisted attacks will be applied in the future, thus we hope the proposed scenario will contribute to future protection efforts.”

Images: Michael Geiger via Unsplash; University of California, San Diego, and the University of Illinois.