Longterm Wiki
publication

Universal and Transferable Adversarial Attacks on Aligned Language Models

Metadata

Source Table: publications
Source ID: xFD4v0FaVJ
Description: Andy Zou, Zifan Wang, Nicholas Carlini et al., 2023
Source URL: llm-attacks.org/
Parent: Center for AI Safety (CAIS)
Children:
Created: Mar 23, 2026, 2:46 PM
Updated: Mar 23, 2026, 2:46 PM
Synced: Mar 23, 2026, 2:46 PM

Record Data

id: xFD4v0FaVJ
entityId: Center for AI Safety (CAIS) (organization)
entityDisplayName:
resourceId:
title: Universal and Transferable Adversarial Attacks on Aligned Language Models
authors: Andy Zou, Zifan Wang, Nicholas Carlini et al.
url: llm-attacks.org/
venue:
publishedDate: 2023
publicationType: paper
citationCount:
isFlagship: Yes
abstract:
source: llm-attacks.org/
notes: Highly influential jailbreaking paper

Source Check Verdicts

confirmed (95% confidence)

Last checked: 4/3/2026

The source text confirms all key fields in the record. The title matches exactly. The listed authors (Andy Zou, Zifan Wang, Nicholas Carlini et al.) are confirmed: the source shows these three plus three additional authors (Milad Nasr, J. Zico Kolter, Matt Fredrikson), so the 'et al.' notation is appropriate and accurate. The publication year 2023 is confirmed by the arXiv identifier (2307.15043, i.e., July 2023). The URL https://llm-attacks.org/ is explicitly shown as the website hosting this research. The publication type 'paper' is confirmed by the explicit '[Paper]' link to arxiv.org/abs/2307.15043. All fields are directly supported by the source text.
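The date inference above relies on the fact that modern arXiv identifiers encode their submission date in a YYMM prefix (so 2307.15043 means July 2023). A minimal sketch of that check, with a hypothetical helper name:

```python
def arxiv_id_date(arxiv_id: str) -> tuple[int, int]:
    """Parse the YYMM prefix of a modern (post-2007) arXiv identifier
    into a (year, month) tuple. Hypothetical helper for illustration."""
    yymm = arxiv_id.split(".")[0]
    year = 2000 + int(yymm[:2])   # '23' -> 2023
    month = int(yymm[2:4])        # '07' -> 7 (July)
    return year, month

print(arxiv_id_date("2307.15043"))  # (2023, 7)
```

This only applies to new-style identifiers (YYMM.number); old-style IDs such as `hep-th/9901001` use a different scheme.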
