AI collaboration platform Hugging Face could be susceptible to model-based attacks. Here's what we know so far.
In the realm of AI collaboration, Hugging Face reigns supreme. But could it be the target of model-based attacks? Recent findings from JFrog suggest a concerning possibility, prompting a closer look at the platform’s security and signaling a new era of caution in AI research.
Here’s what we learned:
Growing security concerns for AI models have been highlighted with the discovery of malicious functionalities within models on Hugging Face, showcasing the potential for targeted attacks.
Key points:
The discussion around machine learning (ML) model security is still not widespread enough, and this blog post aims to broaden the conversation on the topic.
This area encompasses a wide range of security measures designed to safeguard the integrity, confidentiality, and availability of AI systems. As AI models become increasingly integral to sectors such as healthcare, finance, and national security, ensuring their security is paramount. The unique challenges in AI model security include the following (potential mitigations are covered later in this post):
Unique Challenges
Data poisoning: Attackers can manipulate the training data to compromise the model’s performance or cause it to make incorrect predictions.
Model stealing: Competitors or malicious actors may attempt to replicate a proprietary AI model by querying it with inputs and using the outputs to train a similar model.
Adversarial attacks: Slight, often imperceptible alterations to input data can lead to incorrect outputs from the AI model, exploiting vulnerabilities in the model’s decision logic (a minimal sketch follows this list).
Privacy leaks: AI models, especially those trained on sensitive or personal data, can inadvertently reveal information about the training data through their outputs.
Supply chain attacks: Compromises in the supply chain can affect the integrity of AI models by introducing vulnerabilities at any point from development to deployment.
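To make the adversarial-attack entry above concrete, here is a minimal sketch of the classic fast gradient sign method (FGSM) against a toy PyTorch classifier. The model, input shape, and epsilon value are illustrative assumptions and have nothing to do with the models discussed in the JFrog research.

```python
import torch
import torch.nn as nn

# Toy classifier standing in for a real model (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

loss_fn = nn.CrossEntropyLoss()

def fgsm_perturb(x, label, epsilon=0.05):
    """Return x plus a small perturbation that pushes the model toward a wrong answer."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), label)
    loss.backward()
    # Step in the direction that increases the loss, bounded per pixel by epsilon.
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

# A random "image" and an arbitrary true label, purely for demonstration.
image = torch.rand(1, 1, 28, 28)
label = torch.tensor([3])
adversarial = fgsm_perturb(image, label)

print("max pixel change:", (adversarial - image).abs().max().item())  # <= epsilon
```

Because the perturbation is capped at epsilon per pixel, the altered input looks unchanged to a human while the model’s prediction can flip.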
As with other open-source repositories, JFrog has been regularly monitoring and scanning AI models uploaded by users. One of its researchers, David Cohen, discovered a model that executes code the moment its pickle file is loaded. The payload grants the attacker a shell on the compromised machine, enabling them to gain full control over the victim’s machine through what is commonly referred to as a “backdoor”.
JFrog implemented a sophisticated scanning technology to scrutinize PyTorch and TensorFlow Keras models hosted on Hugging Face, identifying around a hundred models exhibiting malicious characteristics.
“It’s crucial to emphasize that when we refer to ‘malicious models,’ we specifically denote those housing real, harmful payloads,” the JFrog report states.
“This count excludes false positives, ensuring a genuine representation of the distribution of efforts towards producing malicious models for PyTorch and Tensorflow on Hugging Face.”
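JFrog has not published the internals of its scanner, but the general idea behind this kind of static analysis can be sketched in a few lines of Python: walk the pickle stream’s opcodes without ever executing them, and flag imports of modules commonly abused for code execution. The blocklist and heuristics below are illustrative assumptions, not JFrog’s actual rules.

```python
import sys
import pickletools

# Modules whose appearance in a pickle stream is a strong red flag (illustrative list).
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "socket"}

def scan_pickle(path):
    """Statically walk a pickle stream (never executing it) and report suspicious imports."""
    findings = []
    pushed_strings = []  # STACK_GLOBAL takes its module/name from previously pushed strings.
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            pushed_strings.append(arg)
        elif opcode.name == "GLOBAL":  # arg looks like "os system"
            module = str(arg).split()[0]
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(str(arg))
        elif opcode.name == "STACK_GLOBAL" and len(pushed_strings) >= 2:
            module, name = pushed_strings[-2], pushed_strings[-1]
            if str(module).split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"{module}.{name}")
    return findings

if __name__ == "__main__":
    for finding in scan_pickle(sys.argv[1]):
        print("suspicious import:", finding)
```

A production scanner would also handle nested archives such as PyTorch’s zip-based checkpoints and weigh context to reduce false positives, but the core trick of inspecting opcodes instead of calling pickle.load is the same.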
Code execution can happen when loading certain types of ML models from an untrusted source. For example, some models use the “pickle” format, a common way of serializing Python objects. However, pickle files can also contain arbitrary code that is executed when the file is loaded.
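The snippet below illustrates that behavior with a deliberately harmless payload: the class name is invented for the example and the “attack” merely prints a message, but the same hook is what a real payload would point at a shell command.

```python
import pickle

class NotReallyAModel:
    def __reduce__(self):
        # Whatever (callable, args) pair __reduce__ returns is invoked at load time.
        return (print, ("arbitrary code just ran inside pickle.loads()",))

blob = pickle.dumps(NotReallyAModel())

# The "victim" merely deserializes the data, and the callable above executes immediately.
pickle.loads(blob)
```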
A notable instance involved a PyTorch model, uploaded by a user called “baller423” and subsequently deleted from Hugging Face, which harbored a payload enabling the creation of a reverse shell to a predetermined host (210.117.212.93).
The nefarious payload leveraged Python’s “__reduce__” hook, which lets an object dictate how the pickle module should reconstruct it, to run arbitrary code when the PyTorch model file was loaded, cleverly bypassing security measures by hiding the malicious code within the serialization process’s trusted framework.
To gain additional insight into the actors’ intentions, JFrog set up a honeypot on an external server, completely isolated from any sensitive networks. By mimicking legitimate systems or services, a honeypot can attract various types of attacks, allowing defenders to monitor and analyze attackers’ activities.
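JFrog has not shared its honeypot configuration, but conceptually such a trap can be as simple as a listener that accepts inbound connections on the port the payload targets and records who connected, without ever executing anything it receives. The port and logging format below are assumptions for illustration.

```python
import socket
from datetime import datetime, timezone

LISTEN_PORT = 4444  # Assumed port; a real deployment would mirror the payload's target port.

def run_honeypot():
    """Accept connections, log the source address, and never execute anything sent to us."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("0.0.0.0", LISTEN_PORT))
        srv.listen()
        while True:
            conn, (addr, port) = srv.accept()
            with conn:
                stamp = datetime.now(timezone.utc).isoformat()
                print(f"[{stamp}] connection from {addr}:{port}")
                conn.settimeout(5)
                try:
                    # Capture (and discard) whatever the attacker sends, for later analysis.
                    data = conn.recv(4096)
                    print(f"  first bytes: {data[:80]!r}")
                except socket.timeout:
                    pass

if __name__ == "__main__":
    run_honeypot()
```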
Further investigation into the IP address range, which belongs to KREONET (the Korea Research Environment Open Network), suggests potential ties between the authors of these models and researchers or AI practitioners.
To mitigate the risks associated with malicious models and potential code-execution vulnerabilities, data scientists should adopt proactive measures such as verifying sources, running security scans, using safe loading methods, keeping dependencies up to date, reviewing model code, isolating execution environments, and educating users.
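As one concrete example of the “safe loading methods” mentioned above, recent PyTorch releases can refuse arbitrary pickled objects via the weights_only flag, and the safetensors format sidesteps pickle entirely. The file names below are placeholders.

```python
import torch
from safetensors.torch import load_file  # requires the safetensors package

# Restrict torch.load to plain tensors and containers; pickled code objects are rejected.
state_dict = torch.load("downloaded_model.bin", map_location="cpu", weights_only=True)

# Alternatively, prefer checkpoints in the safetensors format, which does not use pickle at all.
safe_state_dict = load_file("downloaded_model.safetensors")
```

Neither option removes the need to vet where a model comes from, but both shrink the attack surface of the loading step considerably.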
Hugging Face, a platform for AI collaboration, has implemented several security measures such as malware scanning, pickle scanning, and secrets scanning to prevent these attacks. These features scan every file of the repositories for malicious code, unsafe deserialization, or sensitive information, and alert the users or the moderators accordingly. Despite these measures, recent incidents serve as a stark reminder that the platform is not immune to real threats.
Each new vulnerability is a reminder of where we stand and what we need to do better. Check out the following resources to help you maintain cyber hygiene and stay ahead of the threat actors: