Voyager18 (research)

Hugging Face, Wiz, and the risks of AI-as-a-service

Uncover AI-as-a-Service security insights through Wiz and Hugging Face, highlighting vulnerabilities and necessary security measures.

Yair Divinsky | April 8, 2024

The rapid adoption of AI technology is unprecedented. With more organizations globally embracing AI-as-a-Service, also known as “AI cloud,” the industry needs to acknowledge the potential risks in this shared infrastructure that houses sensitive data. It should enforce mature regulations and security practices akin to those applied to public cloud service providers. 

Moving swiftly often leads to disruptions. Recently, Wiz Research collaborated with AI-as-a-Service firms to uncover prevalent security risks that could impact the industry, potentially endangering users’ data and models. 

This blog outlines the collaborative efforts made by Wiz together with Hugging Face, a prominent AI-as-a-Service provider experiencing rapid growth to meet escalating demand. The discoveries not only prompted Hugging Face to bolster platform security, which they successfully did, but also highlighted broader insights applicable to various AI systems and AI-as-a-Service platforms. 



Wiz’s State of AI in the Cloud report reveals that AI services are already integrated into over 70% of cloud environments, underscoring the significant impact of these findings. 




Wiz and Hugging Face’s collaboration exposed vulnerabilities in AI-as-a-Service, highlighting the risks of untrusted AI models, such as those serialized in the Pickle format, which could lead to arbitrary code execution and cross-tenant access. The findings stress the need for strong security practices and constant vigilance in cloud-based AI integrations.

Background information: AI-as-a-service

AI models require powerful GPUs to run, so inference is often outsourced to services such as Hugging Face’s API, much as applications are run on AWS, GCP, or Azure.

Wiz Research compromised Hugging Face by injecting a malicious model and breaking into other users’ spaces, revealing a cross-tenant access flaw.

This issue, likely not isolated to Hugging Face, points to broader challenges in AI service security and the need for collaboration on secure yet scalable infrastructures.

Hugging Face has also published its own blog post about the collaboration, responding to Wiz’s research and sharing insights and outcomes from its viewpoint.


What is Hugging Face?

Hugging Face stands as a leading open platform for AI developers, focused on democratizing Machine Learning. It provides vital infrastructure for hosting, training, and collaborative AI model development.

Additionally, it acts as a central hub for users to access community-created AI models, datasets, and demonstrations, all while emphasizing the importance of understanding AI/ML risks.


What is the AI-as-a-service threat?

Hugging Face’s central position in AI development makes it a prime target for attacks. A breach could expose private AI models, datasets, and critical applications, leading to considerable damage and potential supply chain risks.

The investigation revealed the risk of malicious models in AI systems, especially for AI-as-a-Service providers, where they can be used for cross-tenant attacks. Attackers exploiting this vulnerability could access a wide range of private AI models and applications. Wiz Research pinpointed two significant risks within Hugging Face’s platform that could be exploited.

  1. Shared inference infrastructure takeover risk: The study found a risk in AI inference infrastructure, where processing untrusted “pickle” format models could allow a malicious model to execute remote code, granting attackers access to other customers’ models.

  2. Shared CI/CD Takeover Risk: Compiling malicious AI applications represents another significant risk, as attackers could aim to hijack the CI/CD pipeline itself, leading to a supply chain attack. After gaining control of the CI/CD cluster, a malicious AI application could further exacerbate the risk. 




These findings underscore the critical need for AI service providers like Hugging Face to implement robust security measures to safeguard against such risks and maintain the integrity of their platforms. 


What types of AI/ML applications exist? 

Various AI/ML applications possess distinct characteristics and scopes, necessitating a nuanced approach to security considerations. A typical AI/ML application comprises several key components: 

  1. Model: The AI models in use, such as LLaMA, BERT, Whisper, among others. 
  2. Application: The code of the application responsible for feeding inputs to the AI model and utilizing its predictions. 
  3. Inference Infrastructure: The infrastructure enabling the execution of the AI model, whether it’s on edge (such as Transformers.js), through an API, or via Inference-as-a-Service (such as Hugging Face’s Inference Endpoints). 

Potential adversaries may target each of these components using varied methods. For instance: 

  • Attacking the AI model directly involves manipulating inputs to induce false predictions, as demonstrated by adversarial.js. 
  • Targeting the application involves using inputs that generate correct predictions but are handled unsafely within the application, potentially leading to vulnerabilities like SQL injection attacks. 
  • Attacking the inference infrastructure can be achieved through a specially crafted pickle-serialized malicious model. Given the common practice of treating AI models as black boxes and using publicly available models, there’s a dearth of tools to verify a model’s integrity against malicious code (e.g., Pickle Scanning by Hugging Face). Developers and engineers must exercise caution when sourcing models to avoid introducing integrity and security risks akin to including untrusted code in their applications. 
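One way to act on the caution in the last bullet is to refuse any pickle that references a global outside an explicit allow-list. The sketch below follows the "restricting globals" pattern from Python's pickle documentation. The names ALLOWED_GLOBALS, RestrictedUnpickler, restricted_loads, and HostilePayload are illustrative, and the allow-list is an assumption: a real model loader would need a far larger, carefully reviewed list.

```python
import io
import pickle
from collections import OrderedDict

# Assumption: only this global is needed by the data we expect to load.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only allow-listed (module, name) pairs."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        # Anything else (eval, os.system, ...) is rejected before it can run.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads() that enforces the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class HostilePayload:
    """Harmless stand-in for a malicious object: asks pickle to call eval."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


safe_blob = pickle.dumps(OrderedDict(a=1))
hostile_blob = pickle.dumps(HostilePayload())

loaded = restricted_loads(safe_blob)  # allowed: only OrderedDict is referenced

try:
    restricted_loads(hostile_blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the eval reference was refused before execution
```

An allow-list fails closed, which is why it tends to be preferred over a deny-list here, but it does not make Pickle safe in general. Formats that store only tensor data, such as Safetensors, avoid the problem at the source.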

To demonstrate this, let’s look at a specialized serialization exploit to gain access to Hugging Face’s infrastructure and discuss measures to mitigate such risks effectively. 



Security queries and findings for AI

Wiz’s research into cloud isolation vulnerabilities, particularly for AI security, raises concerns about AI-as-a-Service and potential exploitation for cross-tenant access. The critical questions are: How isolated are AI models operating on these platforms, and how effective is this isolation?

The research delved into three pivotal aspects of the platform: 

  • Inference API: This feature enables the community to explore and experiment with available models on the platform without the need to install necessary dependencies locally. Users can interact with and preview these models via a modal on the platform, powered by the Inference API. 
  • Inference Endpoints: Hugging Face’s fully managed offering allows users to deploy AI models on dedicated infrastructure tailored for production purposes, essentially functioning as an Inference-as-a-Service solution. 
  • Spaces: This functionality provides a straightforward method for hosting AI/ML applications, facilitating the showcasing of AI models or collaborative development of AI-powered applications. 




Researching Hugging Face Inference API and inference endpoints

Investigating Hugging Face’s Inference capabilities, such as the Inference API and Endpoints, researchers noted that users can upload custom models, with Hugging Face automating the setup for interaction and predictions. This raised a crucial question:

Could a user upload a crafted, potentially malicious model to execute arbitrary code within this interface? If so, what insights or vulnerabilities could this expose?


Submitting a malicious model to the platform

Hugging Face’s platform accommodates a range of AI model formats, with a notable emphasis on two formats: PyTorch (Pickle) and Safetensors, as evidenced by a quick search on their platform. 




It’s widely acknowledged that Python’s Pickle format carries inherent risks, including the potential for remote code execution upon deserialization of untrusted data. This caution is echoed in Pickle’s official documentation, which warns that the pickle module is not secure and that only trusted data should be unpickled.
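To make the risk concrete, here is a minimal and deliberately harmless sketch of the mechanism. A class can define `__reduce__` to tell Pickle which callable to invoke when the object is rebuilt, so loading the bytes runs that callable. The class name Payload and the arithmetic expression are illustrative only, not taken from the research.

```python
import pickle


class Payload:
    """Harmless stand-in for a malicious object embedded in a model file."""

    def __reduce__(self):
        # Pickle stores "call eval('6 * 7')" in the byte stream.
        # An attacker would pick a more damaging callable and argument.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Deserializing performs the stored call inside the loading process.
# No method of Payload is ever invoked explicitly by the loader.
result = pickle.loads(blob)
print(result)  # prints 42: the expression ran during unpickling
```

The loader never needs the Payload class itself. The byte stream only carries a reference to the callable and its arguments, which is why inspecting the object you get back tells you nothing about what ran while it loaded.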


Given Pickle’s vulnerability, Hugging Face conducts analysis, such as Pickle Scanning and Malware Scanning, on Pickle files uploaded to their platform. They flag and caution users about potentially hazardous models. 
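The general idea behind this kind of scanning can be sketched with the standard library alone: walk the pickle opcodes without executing them and list every global the stream would import. This is a simplified illustration, not Hugging Face's actual scanner. The names list_imports, is_suspicious, and SUSPICIOUS are hypothetical, and the deny-list is a small assumed sample.

```python
import pickle
import pickletools

# Assumption: a tiny sample deny-list; real scanners track many more globals.
SUSPICIOUS = {
    ("builtins", "eval"), ("builtins", "exec"),
    ("os", "system"), ("posix", "system"), ("nt", "system"),
    ("subprocess", "Popen"),
}


def list_imports(data: bytes):
    """Return (module, name) pairs referenced by the pickle, without loading it."""
    found = []
    recent_strings = []  # string constants seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports these arguments as "module name"
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are usually the two latest strings.
            # Heuristic only: it misses names fetched back from the memo.
            if len(recent_strings) >= 2:
                found.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


def is_suspicious(data: bytes) -> bool:
    return any(pair in SUSPICIOUS for pair in list_imports(data))


class Payload:
    """Harmless test object whose pickle references builtins.eval."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


malicious = pickle.dumps(Payload())
benign = pickle.dumps({"weights": [0.1, 0.2]})
```

Because `pickletools.genops` only parses the stream, the scan itself never triggers the payload. A deny-list can still be bypassed by callables it does not know about, which fits the platform's choice to flag and caution rather than guarantee safety.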



Despite recognizing these dangers, Hugging Face still permits users to run inference on uploaded Pickle-based models using the platform’s infrastructure. PyTorch pickle remains widely used within the community, so Hugging Face needs to support it.

Researchers aimed to test the outcomes of uploading and interacting with a malicious Pickle model through the Inference API:

Would the malicious code run? Would it be in a sandboxed environment? And are these models sharing infrastructure with other Hugging Face users, essentially making the Inference API a multi-tenant service?


Executing remote code via customized Pickle file

Put simply, it is easy to craft a PyTorch (Pickle) model that executes arbitrary code upon loading. Wiz researchers did so by cloning a well-known model like gpt2, including essential files like config.json, which tell Hugging Face how to run it.

They then tweaked this model to trigger a reverse shell upon loading. Subsequently, they uploaded this customized model to Hugging Face as a private model and tested it via the Inference API feature. As anticipated, the reverse shell functionality was successfully triggered. 



Wiz Research streamlined their testing by creating a malicious model that simulates a shell. They intercepted Hugging Face’s Python functions that manage model inference results (after Pickle deserialization, during code execution), enabling a shell-like interface.

When the model encounters a predefined keyword (Backdoor), it executes a command.



Amazon EKS Privilege Escalation via IMDS 

After executing code within Hugging Face’s Inference API and obtaining a reverse shell, researchers found they were in a Pod within a Kubernetes cluster on Amazon EKS.

Their frequent encounters with Amazon EKS over the last year, during security research on service providers, led to the development of a playbook for identifying signs of an EKS cluster. These insights are elaborated in the 2023 Kubernetes Security report.

The investigation revealed that the node’s Instance Metadata Service (IMDS) was reachable from their pod. By querying IMDS for the node’s identity, researchers could assume the role of a Node within the EKS cluster using the aws eks get-token command. This issue, common in Amazon EKS, was previously noted in Wiz’s EKS Cluster Games CTF (Challenge #4).

However, generating a valid Kubernetes cluster token required the correct cluster name for the aws eks get-token command. Initially unsuccessful in guessing, the team found their AWS role had DescribeInstances permissions, a default setting that exposed the cluster name through a node tag.



Finally, using the aws eks get-token command and the IAM identity obtained from the IMDS, they successfully generated a valid Kubernetes token with the role of a Node. 



With the Node role within the Amazon EKS cluster, they were then able to gain enhanced privileges, enabling them to further explore the environment. 

One of the actions involved listing information about the Pod where they were operating using kubectl get pods/$(hostname), followed by inspecting the associated secrets. Their demonstration showed that by accessing secrets (via kubectl get secrets), lateral movement within the EKS cluster was indeed possible. 


Potential consequences and remedial measures

The acquisition of secrets posed a considerable threat to the platform had they fallen into malicious hands. In shared environments, such as this, compromised secrets can pave the way for cross-tenant access and the inadvertent leakage of sensitive data. 

To address this vulnerability, the implementation of IMDSv2 with Hop Limit is strongly advised. This measure serves to prevent pods from reaching the IMDS and extracting the node’s role within the cluster, thereby bolstering the overall security posture. 
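As a rough sketch of what that setting looks like in practice, the snippet below builds the parameters for the EC2 ModifyInstanceMetadataOptions call and applies them through boto3. The function names and the instance ID are hypothetical, and it assumes boto3 is installed with credentials allowed to modify the instance. On EKS the same options would more typically be set in the node group's launch template.

```python
def imds_hardening_params(instance_id: str) -> dict:
    """Build parameters that require IMDSv2 and keep tokens on the node itself."""
    return {
        "InstanceId": instance_id,
        "HttpEndpoint": "enabled",
        # Require session tokens (IMDSv2); unauthenticated v1 requests fail.
        "HttpTokens": "required",
        # A hop limit of 1 means the token response cannot cross the extra
        # network hop between the node and a pod on the pod network.
        "HttpPutResponseHopLimit": 1,
    }


def apply_imds_hardening(instance_id: str) -> None:
    """Apply the settings to one instance (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the sketch loads without it

    ec2 = boto3.client("ec2")
    ec2.modify_instance_metadata_options(**imds_hardening_params(instance_id))


# Hypothetical instance ID, for illustration only.
params = imds_hardening_params("i-0123456789abcdef0")
```

Pods that run with host networking share the node's network namespace and can still reach IMDS, so this setting complements pod-level restrictions and does not replace them.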


Researching and exploring Hugging Face Spaces

As previously noted, Spaces is a distinct offering within Hugging Face that enables users to deploy their AI-driven applications on Hugging Face’s infrastructure, facilitating collaborative development and public showcasing of the application.

Interestingly, Hugging Face only necessitates a Dockerfile from users to execute their application on the Spaces service. 

Executing remote code via customized Dockerfile

The investigation commenced with a Dockerfile designed to trigger a malicious payload via the CMD instruction, which specifies the program to execute when the Docker container starts.

Following successful code execution and subsequent exploration of the environment, the researchers observed considerable restrictions and isolation.

Consequently, they opted to employ the RUN instruction instead of CMD, granting them the ability to execute code during the build process and potentially encounter a distinct environment. 


Network segmentation issue: write permissions on centralized container registry

Upon executing code during the image building phase, the researchers used the netstat command to inspect network connections originating from their system.

One connection was traced to an internal container registry where the constructed layers were pushed. This aligns with the standard practice of storing images in a container registry. However, this registry didn’t solely serve the researcher’s needs but also catered to other customers of Hugging Face.

Due to inadequate scoping, it was feasible to pull and push (including overwriting) all available images within that container registry. 
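For contrast, here is a minimal sketch of what per-tenant scoping could look like: each build receives a credential limited to its own repository namespace and a fixed set of actions. Everything in it (BuildCredential, is_allowed, the tenant and repository names) is hypothetical and only illustrates the principle. It does not describe how Hugging Face's registry is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildCredential:
    """Hypothetical credential handed to a single tenant's build job."""

    tenant: str
    repository_prefix: str  # e.g. "spaces/tenant-a"
    actions: frozenset      # e.g. frozenset({"pull", "push"})


def is_allowed(cred: BuildCredential, action: str, repository: str) -> bool:
    """Allow an action only inside the credential's own namespace."""
    prefix = cred.repository_prefix
    # Compare whole path segments so "spaces/tenant-a" does not also
    # match "spaces/tenant-ab".
    in_scope = repository == prefix or repository.startswith(prefix + "/")
    return in_scope and action in cred.actions


cred = BuildCredential(
    tenant="tenant-a",
    repository_prefix="spaces/tenant-a",
    actions=frozenset({"pull", "push"}),
)

own_push = is_allowed(cred, "push", "spaces/tenant-a/demo-app")
other_pull = is_allowed(cred, "pull", "spaces/tenant-b/private-app")
lookalike = is_allowed(cred, "push", "spaces/tenant-ab/app")
```

With a check like this enforced by the registry itself, code running inside one tenant's build could still push its own layers but could not read or overwrite another tenant's images.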


Key insights

This study emphasizes the security dangers of using untrusted AI models, especially Pickle-serialized ones, highlighting the risk of arbitrary code execution on infrastructure. As AI rapidly advances, organizations must oversee and govern their AI stack, analyzing risks like malicious model use and vulnerabilities.

Collaboration between security experts and developers is vital for understanding and mitigating these risks. Hugging Face’s adoption of Wiz CSPM and regular security assessments exemplify proactive steps towards safeguarding against potential threats.


Next steps 

Each new vulnerability is a reminder of where we stand and what we need to do better. Check out the following resources to help you maintain cyber hygiene and stay ahead of the threat actors: 

  1. The state of AI in cyber security: our research so far
  2. The MITRE ATT&CK framework: Getting started
  3. The true impact of exploitable vulnerabilities for 2024
  4. Multi-cloud security challenges – a best practice guide
  5. How to properly tackle zero-day threats
