Longterm Wiki

Credibility Rating

High (4/5)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Meta AI

Meta's Purple Llama initiative is a practical industry contribution to AI safety tooling, offering open benchmarks and classifiers relevant to deployment-time safety; useful for practitioners implementing safeguards in LLM-based products.

Metadata

Importance: 55/100 · blog post · primary source

Summary

Meta announces Purple Llama, an umbrella project releasing open-source trust and safety tools for generative AI developers. The initial release includes CyberSec Eval (cybersecurity safety benchmarks for LLMs) and Llama Guard (an input/output safety classifier), aiming to democratize access to safety infrastructure for responsible AI deployment.

Key Points

  • Purple Llama is an open-source umbrella project providing trust and safety tools and evaluations for generative AI developers.
  • CyberSec Eval provides benchmarks specifically for evaluating cybersecurity risks posed by large language models.
  • Llama Guard is a safety classifier for filtering LLM inputs and outputs, optimized for ease of deployment.
  • The project aims to level the playing field so developers of all sizes can implement responsible AI practices.
  • Tools are released in alignment with Meta's Responsible Use Guide to support safe deployment of generative AI.
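The input/output filtering pattern that Llama Guard supports can be sketched as follows. This is a minimal illustration, not Meta's implementation: `classify` is a hypothetical keyword stub standing in for a real Llama Guard model call, which classifies a conversation turn as "safe" or "unsafe" against a policy.

```python
# Sketch of input/output safety filtering around an LLM call.
# `classify` is a toy stand-in for Llama Guard; the real classifier is
# itself an LLM that labels text "safe" or "unsafe" with policy categories.

UNSAFE_PHRASES = {"build a bomb", "steal credentials"}  # hypothetical toy policy


def classify(text: str) -> str:
    """Toy stand-in for a Llama Guard call."""
    lowered = text.lower()
    return "unsafe" if any(p in lowered for p in UNSAFE_PHRASES) else "safe"


def guarded_generate(prompt: str, generate) -> str:
    """Run a model behind input and output safety checks."""
    if classify(prompt) == "unsafe":        # input filtering
        return "[refused: unsafe prompt]"
    response = generate(prompt)
    if classify(response) == "unsafe":      # output filtering
        return "[withheld: unsafe response]"
    return response
```

The same classifier is applied twice: once to the user prompt before it reaches the model, and once to the model's response before it reaches the user.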

Cited by 1 page

Page                      Type  Quality
AI Governance and Policy  Crux  66.0

Cached Content Preview

HTTP 200 · Fetched Apr 7, 2026 · 8 KB
Announcing Purple Llama: Towards open trust and safety in the new world of generative AI 
December 7, 2023 • 3 minute read

We’re announcing Purple Llama, an umbrella project featuring open trust and safety tools and evaluations meant to level the playing field for developers to responsibly deploy generative AI models and experiences in accordance with best practices shared in our Responsible Use Guide.
As a first step, we are releasing CyberSec Eval, a set of cybersecurity safety evaluation benchmarks for LLMs, and Llama Guard, a safety classifier for input/output filtering that is optimized for ease of deployment.
Aligned with our open approach, we look forward to partnering with the newly announced AI Alliance, AMD, AWS, Google Cloud, Hugging Face, IBM, Intel, Lightning AI, Microsoft, MLCommons, NVIDIA, Scale AI, and many others to improve these tools and make them available to the open source community.
 RECOMMENDED READS

 Meta and Microsoft Introduce the Next Generation of Llama 
 Celebrating 10 years of FAIR: A decade of advancing the state-of-the-art through open research 
 AI Alliance Launches as an International Community of Leading Technology Developers, Researchers, and Adopters Collaborating Together to Advance Open, Safe, Responsible AI 
Generative AI has brought about a new wave of innovation unlike any we’ve seen before. With it, we have the ability to converse with AI assistants, generate realistic imagery, and accurately summarize large corpora of documents, all from simple prompts. With over 100 million downloads of Llama models to date, a lot of this innovation is being fueled by open models.

Collaboration on safety will build trust in the developers driving this new wave of innovation, but it requires additional research and contributions on responsible AI. The people building AI systems can’t address the challenges of AI in a vacuum, which is why we want to level the playing field and create a center of mass for open trust and safety.

 Today, we are announcing the launch of Purple Llama — an umbrella project that, over time, will bring together tools and evaluations to help the community build responsibly with open generative AI models. The initial release will include tools and evaluations for cybersecurity and input/output safeguards, with more tools to come in the near future.

 Components within the Purple Llama project will be licensed permissively, enabling both research and commercial usage. We believe this is a major step towards enabling community collaboration and standardizing the development and usage of trust and safety tools for generative AI development.

 The first step forward

Cybersecurity and LLM prompt safety are important areas for generative AI safety today. We have prioritized these considerations

... (truncated, 8 KB total)
Resource ID: 315e5a93a4e6fa9f | Stable ID: sid_tSN2VrnsnT