[2404.12241] Introducing v0.5 of the AI Safety Benchmark from MLCommons
paper
Credibility Rating: 3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
Introduces MLCommons' AI Safety Benchmark v0.5, a standardized evaluation framework developed by a large industry-academic consortium to measure and compare AI safety across models, directly relevant to AI safety evaluation methodology.
Metadata
Importance: 72/100 · arXiv preprint · primary source
Summary
This paper presents v0.5 of the MLCommons AI Safety Benchmark, developed by a large consortium of industry and academic researchers. The benchmark aims to standardize the evaluation of AI safety across models and systems. Its primary goal is to advance the state of the art in AI safety evaluation and stimulate innovation in safety practices.
Key Points
- Introduces a standardized AI Safety Benchmark (v0.5) created by the MLCommons AI Safety Working Group, a broad industry-academic consortium.
- The benchmark is designed to enable consistent, comparable evaluation of AI safety properties across different models and systems.
- Developed collaboratively by researchers from Stanford, Meta, Google, Microsoft, NVIDIA, and many other institutions.
- Aims to facilitate better AI safety processes and stimulate innovation in safety evaluation methodologies.
- Represents an early-stage (v0.5) but significant step toward community-wide standards for AI safety benchmarking.
1 FactBase fact citing this source
| Entity | Property | Value | As Of |
|---|---|---|---|
| GovAI | publication | Safety Cases for Frontier AI — argues for structured safety arguments analogous to safety cases in other high-risk industries | 2024 |
Cached Content Preview
HTTP 200 · Fetched Apr 7, 2026 · 98 KB
[2404.12241] Introducing v0.5 of the AI Safety Benchmark from MLCommons
Bertie Vidgen (1), Adarsh Agrawal (53), Ahmed M. Ahmed (2, 9), Victor Akinwande (60), Namir Al-Nuaimi (56), Najla Alfaraj (64), Elie Alhajjar (4), Lora Aroyo (5), Trupti Bavalatti (6), Borhane Blili-Hamelin (62), Kurt Bollacker (1), Rishi Bommasani (2), Marisa Ferrara Boston (7), Siméon Campos (66), Kal Chakra (3), Canyu Chen (8), Cody Coleman (9), Zacharie Delpierre Coudert (6), Leon Derczynski (10), Debojyoti Dutta (11), Ian Eisenberg (12), James Ezick (13), Heather Frase (14), Brian Fuller (6), Ram Gandikota (15), Agasthya Gangavarapu (16), Ananya Gangavarapu (17), James Gealy (66), Rajat Ghosh (11), James Goel (13), Usman Gohar (18), Sujata Goswami (3), Scott A. Hale (24, 63), Wiebke Hutiri (19), Joseph Marvin Imperial (20, 55), Surgan Jandial (21), Nick Judd (32), Felix Juefei-Xu (22), Foutse Khomh (23), Bhavya Kailkhura (35), Hannah Rose Kirk (24), Kevin Klyman (2), Chris Knotz (25), Michael Kuchnik (26), Shachi H. Kumar (27), Chris Lengerich (28), Bo Li (29), Zeyi Liao (30), Eileen Peters Long (10), Victor Lu (3), Yifan Mai (2), Priyanka Mary Mammen (31), Kelvin Manyeki (61), Sean McGregor (32), Virendra Mehta (33), Shafee Mohammed (34), Emanuel Moss (27), Lama Nachman (27), Dinesh Jinenhally Naganna (15), Amin Nikanjam (23), Besmira Nushi (36), Luis Oala (37), Iftach Orr (56), Alicia Parrish (5), Cigdem Patlak (3), William Pietri (1), Forough Poursabzi-Sangdeh (38), Eleonora Presani (6), Fabrizio Puletti (12), Paul Röttger (39), Saurav Sahay (27), Tim Santos (57), Nino Scherrer (40), Alice Schoenauer Sebag (59), Patrick Schramowski (41), Abolfazl Shahbazi (42), Vin Sharma (43), Xudong Shen (44), Vamsi Sistla (45), Leonard Tang (58), Davide Testuggine (6), Vithursan Thangarasa (54), Elizabeth Anne Watkins (27), Rebecca Weiss (1), Chris Welty (5), Tyler Wilbers (42), Adina Williams (26), Carole-Jean Wu (26), Poonam Yadav (47), Xianjun Yang (48), Yi Zeng (49), Wenhui Zhang (50), Fedor Zhdanov (51), Jiacheng Zhu (52), Percy Liang (2), Peter Mattson (65), Joaquin Vanschoren (46)
(1) MLCommons; (2) Stanford University; (3) Independent; (4) RAND; (5) Google Research; (6) Meta; (7) Reins AI; (8) Illinois Institute of Technology; (9) Coactive AI; (10) NVIDIA; (11) Nutanix; (12) Credo AI; (13) Qualcomm Technologies, Inc.; (14) Center for Security and Emerging Technology; (15) Juniper Networks; (16) Ethriva; (17) Caltech; (18) Iowa State University; (19) Sony AI; (20) University of Bath; (21) Adobe; (22) New York University; (23) Polytechnique Montreal; (24) University of Oxford; (25) Commn Ground; (26) FAIR, Meta; (27) Intel Labs; (28) Context Fund; (29) University of Chicago; (30) The Ohio State University; (31) UMass Amherst; (32) Digital Safety Research Institute; (33) University of Trento; (34) Project Humanit.ai; (35) Lawrence Livermore National Laboratory; (36) Microsoft Research; (37) Dotphoton; (38) Microsoft; (39) Bocconi University; (40) Patronus AI; (41) DFKI & Hessian.AI; (42) Intel Corporation
... (truncated, 98 KB total)
Resource ID: kb-12b3d02782623a4f