Longterm Wiki
Publication · xFD4v0FaVJ · Record

Publication: Universal and Transferable Adversarial Attacks on Aligned Language Models by Andy Zou, Zifan Wang, Nicholas Carlini et al. (2023)

Verdict: confirmed (95%)
1 check · 4/3/2026

The source text confirms all key fields in the record. The title matches exactly. The authors listed (Andy Zou, Zifan Wang, Nicholas Carlini et al.) are confirmed—the source shows these three plus three additional authors (Milad Nasr, J. Zico Kolter, Matt Fredrikson), so the 'et al.' notation is appropriate and accurate. The publication year 2023 is confirmed by the arXiv link (2307.15043, a July 2023 submission). The URL https://llm-attacks.org/ is explicitly shown as the website hosting this research. The publication type 'paper' is confirmed by the explicit '[Paper]' link to arxiv.org/abs/2307.15043. All fields are directly supported by the source text.
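The year check above relies on the fact that modern arXiv identifiers encode their submission date: the digits before the dot are YYMM, so 2307.15043 is a July 2023 submission. A minimal sketch of that decoding (this helper is illustrative, not part of the wiki's tooling):

```python
def arxiv_id_date(arxiv_id: str) -> tuple[int, int]:
    """Return (year, month) encoded in a modern-format arXiv ID (YYMM.NNNNN)."""
    yymm = arxiv_id.split(".")[0]
    year = 2000 + int(yymm[:2])  # "23" -> 2023
    month = int(yymm[2:])        # "07" -> July
    return year, month

print(arxiv_id_date("2307.15043"))  # -> (2023, 7)
```

This only applies to post-2007 identifiers; older arXiv IDs (e.g. `hep-th/9901001`) use a different archive-prefixed scheme.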

Our claim

entire record
Title
Universal and Transferable Adversarial Attacks on Aligned Language Models
Authors
Andy Zou, Zifan Wang, Nicholas Carlini et al.
Published Date
2023
Publication Type
paper
Is Flagship
Yes
Notes
Highly influential jailbreaking paper

Source evidence

1 src · 1 check
confirmed (95%) · Haiku 4.5 · 4/3/2026


Case № xFD4v0FaVJ · Filed 4/3/2026 · Confidence 95%