
Enabling AI Defenders

Enabling developers and critical organizations to better defend key systems, services, and infrastructure in the age of AI.

Our approach

We believe in cross-industry collaboration among the organizations that play a critical role in defending the systems, services, and infrastructure that society relies on every day. We’re introducing and expanding our support for select partners through the Llama Defenders Program, and broadly enabling the developer community to better defend their organizations in the age of AI.
Llama Generated Audio Detector
A new model designed to classify whether a given audio file has been generated by AI.
Audio watermark detector
New audio watermarking and detection technology that provides industry-leading performance on accuracy, imperceptibility, and speed.

Zendesk

Zendesk is using the Llama Generated Audio Detector to help detect whether a voice is AI-generated and might be impersonating a customer or executive.

Automatic sensitive document classification

As part of our efforts to support the defender community more broadly, we are also sharing the Automatic Sensitive Document Classification tool. It is a new security tool designed to automatically apply security classification labels to your organization’s internal documents to help prevent unauthorized access and distribution. Developers can access this tool through GitHub and can configure customized security protections based on the applied labels, for example disabling copies, moves, or external shares for files with highly sensitive labels. These labels can also be used when setting up company-wide RAG implementations.
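To make the idea of label-based protections concrete, here is a minimal Python sketch of how labels might map to allowed actions and how they might gate documents in a RAG pipeline. It does not use or reproduce the tool's actual API: the label names (`PUBLIC`, `INTERNAL`, `HIGHLY_SENSITIVE`), the `Policy` fields, and the helpers `is_allowed` and `filter_for_rag` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical label levels; the real tool's labels may differ."""
    PUBLIC = 0
    INTERNAL = 1
    HIGHLY_SENSITIVE = 2


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-label protections (copy, move, external share)."""
    allow_copy: bool
    allow_move: bool
    allow_external_share: bool


# Assumed mapping: highly sensitive files cannot be copied, moved, or shared.
POLICIES = {
    Sensitivity.PUBLIC: Policy(True, True, True),
    Sensitivity.INTERNAL: Policy(True, True, False),
    Sensitivity.HIGHLY_SENSITIVE: Policy(False, False, False),
}


def is_allowed(label: Sensitivity, action: str) -> bool:
    """Return whether `action` ('copy', 'move', 'external_share') is permitted."""
    return getattr(POLICIES[label], f"allow_{action}")


def filter_for_rag(documents: list[dict], max_label: Sensitivity) -> list[dict]:
    """Keep only documents whose label is at or below the caller's clearance."""
    return [doc for doc in documents if doc["label"] <= max_label]


docs = [
    {"name": "press_release.txt", "label": Sensitivity.PUBLIC},
    {"name": "org_chart.txt", "label": Sensitivity.INTERNAL},
    {"name": "merger_plan.txt", "label": Sensitivity.HIGHLY_SENSITIVE},
]
visible = filter_for_rag(docs, Sensitivity.INTERNAL)
```

In this sketch, filtering happens before retrieval, so documents above the caller's clearance never reach the model's context. A real deployment would take its labels from the classifier's output rather than from hard-coded values.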
CyberSOC Eval
In partnership with CrowdStrike, we’ve released a set of new benchmarks that provide the first framework for measuring the efficacy of AI systems on representative security operations center (SOC) tasks. These include Malware Analysis and Threat Intelligence Reasoning.
AutoPatchBench
A new benchmark that measures the ability of an AI system to automatically patch security vulnerabilities in native code. It provides a standardized way to measure the performance of automated patching agents, and enables code owners to integrate automated evaluation into development cycles.