AI Model Extraction: Understand & Prevent IP Theft

Artificial Intelligence Model Extraction

06 Feb 2024 By Anastasia Angou

ML security

Machine learning models are the results of highly complex computations and optimization over a massive amount of data. Data is the fuel of a machine learning algorithm — collecting, cleaning, and formatting it is often the hardest part of the process. Whether it’s due to privacy regulations, missing sensors, or fragmented records, gathering usable data is difficult and expensive.

Training a model is expensive. Its accuracy depends heavily on the quality of your data and your technical know-how — both of which constitute valuable intellectual property.

Once trained, models are deployed in real-world applications: in the cloud, on-premises, or on edge devices such as smartphones or IoT devices. A model deployed in the cloud acts as a black box for users: they send inputs and observe outputs, but its internal logic remains opaque. However, models deployed on-device or on-premises are vulnerable to model extraction through reverse engineering.

What are the purposes of stealing a model?

Attackers have three main motivations for accessing your model’s architecture and weights:

Stealing Intellectual Property

It can be an act of industrial espionage where the objective is to steal your IP and acquire your know-how at a fraction of the cost. This can also affect your competitive edge. Training a highly accurate model requires significant investment, and the resulting model can offer decisive strategic value and high returns. That makes it a prime target for attackers.

Avoiding Licensing Costs

If you monetize your model through licensing, attackers may try to clone it and use an unauthorized copy. In other words, they want the same output for free. In this case, protecting your model is a way to protect your revenue.

Enabling More Advanced Attacks

With access to the model’s internals (weights, layers, etc.), attackers can shift from black-box attacks to white-box attacks. This makes it easier to:

Create adversarial examples (inputs designed to fool your model)
Run model inversion attacks (reconstructing private training data)
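
To illustrate why white-box access helps, here is a minimal sketch of an adversarial example built with the Fast Gradient Sign Method (FGSM). It assumes a tiny logistic-regression model whose weights the attacker already holds. The weights, input, and perturbation budget are made-up values for illustration, not taken from any real system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of the positive class for a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """Shift each feature by +/- eps in the direction that increases the loss.

    White-box step: the gradient of the cross-entropy loss with respect to
    the input is (p - y) * w, which requires knowing the weights w.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical extracted weights and a correctly classified input.
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.4, 0.2, 0.3], 1.0

x_adv = fgsm(x, y, w, b, eps=0.3)
clean_score = predict(x, w, b)     # above 0.5: classified as positive
adv_score = predict(x_adv, w, b)   # below 0.5: the prediction flips
```

With only black-box access, the attacker would have to estimate this gradient from many queries. With the weights in hand, a single computation is enough.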

How easy is it to steal your models?

As mentioned earlier, models can be deployed in the cloud, on-premises, or on an edge device.

In the case of cloud deployment, users access the model through a remote API. They send a request to the server, and the server returns a response. This gives companies more control: the model stays on their servers, and only the predictions are exposed. This setting is often called MLaaS, for Machine Learning as a Service.

However, attackers can still send requests repeatedly, analyse the outputs, and train their own replica of the model. This is called distillation, and the attack is also known as model extraction or model stealing. The stolen model may differ in its internal structure (architecture, weights, etc.) but still produces nearly the same outputs given the same inputs.

This attack has a clear benefit: it allows attackers to get an accurate model without having to gather a massive amount of training data. In practice, for the attack to be effective and reach the same level of service, the attacker needs access to data with the same distribution as the original training data set. This works well for models trained on public data, but will not produce a model as good as the victim model when the training data set is proprietary.

In the case of on-premises or on-device deployment, models run locally inside an app or device. As described in our previous post, this offers reduced latency and better privacy, but it also makes the attacker’s task easier.

In this setting, attackers have full control of the environment, which enables powerful reverse-engineering attacks. The objective is to extract the software’s implementation and assets, including the machine learning models. Attackers do not reconstruct a model similar to yours; they directly copy your model. This kind of attack can be performed in a couple of hours and does not require access to computational resources or training data.
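
As a rough illustration of the simplest case, the sketch below audits a zip-based app package (an Android APK is a zip archive) for files with common model extensions. The extension list and the demo package contents are assumptions made for this example. It only shows how visible an unprotected model file is, and a real reverse-engineering effort would go further than listing file names.

```python
import io
import zipfile

# Illustrative, non-exhaustive list of model file extensions.
MODEL_EXTENSIONS = (".tflite", ".onnx", ".pt", ".mlmodel")

def find_model_files(package):
    """Return archive entries whose names end with a known model extension.

    `package` is a path or file-like object pointing to a zip-based package.
    """
    with zipfile.ZipFile(package) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.filename.lower().endswith(MODEL_EXTENSIONS)
        ]

def build_demo_package():
    """Build a fake in-memory package so the example runs without a real app."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("classes.dex", b"placeholder code")
        archive.writestr("assets/detector.tflite", b"placeholder model bytes")
    buffer.seek(0)
    return buffer

found = find_model_files(build_demo_package())
```

Running a check like this against your own release build is a quick way to see whether a model ships as a plain, directly copyable file.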

Unfortunately, current software protections are easy to bypass when it comes to artificial intelligence. That’s where Skyld comes in.