Jason Wei of Google Brain
quantamagazine.org/how-quickly-do-large-language-models-l...
Relevant to AI safety discussions about whether dangerous capabilities can emerge suddenly and without warning; the measurement-artifact hypothesis suggests better evaluation design could improve foresight into capability development.
Metadata
Importance: 62/100 · news article
Summary
A Quanta Magazine article covering a Stanford study arguing that so-called 'emergent' abilities in large language models are not sudden or unpredictable, but appear so due to measurement choices. When different metrics are used, the abilities develop gradually and smoothly with scale, suggesting the 'phase transition' framing may be a measurement artifact rather than a genuine phenomenon.
Key Points
- The BIG-bench benchmark found some LLM abilities appeared to jump abruptly with scale rather than improving smoothly, leading to claims of 'emergent' behavior.
- A Stanford trio argues sudden emergence is an artifact of coarse metrics (e.g., exact-match accuracy) that only register success at a late threshold.
- Using continuous or finer-grained metrics reveals gradual, predictable improvement, undermining the phase-transition analogy.
- The debate has direct AI safety implications: if emergence is unpredictable, dangerous capabilities could appear without warning; if predictable, they can be anticipated.
- The article interviews Jason Wei (Google Brain/DeepMind), one of the original emergence paper authors, who acknowledges the measurement critique but defends the phenomenon's relevance.
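The coarse-metric point above can be made concrete with a toy calculation (a sketch, not from the article: the answer length and accuracy values are hypothetical). If a model's per-token accuracy improves smoothly with scale, an all-or-nothing exact-match score on a multi-token answer is roughly the per-token accuracy raised to the answer length, so it sits near zero and then appears to jump:

```python
# Toy illustration of the measurement-artifact argument: a smooth
# underlying improvement looks abrupt under a coarse, all-or-nothing metric.
# Assumption: the answer is only "correct" if all L tokens are right.
L = 10  # hypothetical answer length in tokens

# Hypothetical per-token accuracies as models scale up (smooth growth).
per_token_accuracies = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99]

for p in per_token_accuracies:
    exact_match = p ** L  # coarse metric: all tokens must be correct
    print(f"per-token={p:.2f}  exact-match={exact_match:.4f}")
```

The per-token column rises gradually, while the exact-match column stays near zero until late and then climbs steeply, mimicking a "phase transition" even though nothing discontinuous happened.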
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Emergent Capabilities | Risk | 61.0 |
Cached Content Preview
HTTP 200 · Fetched Apr 7, 2026 · 12 KB
How Quickly Do Large Language Models Learn Unexpected Skills? | Quanta Magazine
By
Stephen Ornes
February 13, 2024
A new study suggests that so-called emergent abilities actually develop gradually and predictably, depending on how you measure them.
Kristina Armitage/ Quanta Magazine
Introduction
Abstractions blog
artificial intelligence
computer science
large language models
machine learning
natural language processing
neural networks
Two years ago, in a project called the Beyond the Imitation Game benchmark, or BIG-bench, 450 researchers compiled a list of 204 tasks designed to test the capabilities of large language models, which power chatbots like ChatGPT. On most tasks, performance improved predictably and smoothly as the models scaled up — the larger the model, the better it got. But on other tasks, the improvement wasn’t smooth: performance remained near zero for a while, then jumped. Other studies found similar leaps in ability.
The authors described this as “breakthrough” behavior; other researchers have likened it to a phase transition in physics, like when liquid water freezes into ice. In a paper published in August 2022, researchers noted that these behaviors are not only surprising but unpredictable, and that they should inform the evolving conversations around AI safety, potential and risk. They called the abilities “emergent,” a word that describes collective behaviors that only appear once a system reaches a high level of complexity.
But things may not be so simple. A new paper by a trio of researchers at Stanford University posits that the sudden appearance of these abilities is just a consequence of the way researchers measure the LLM’s performance. The abilities, they argue, are neither unpredictable nor sudden. “The transition is much more predictable than people give it credit for,” said Sanmi Koyejo,
... (truncated, 12 KB total)
Resource ID: 38328f97c152d10f | Stable ID: sid_FF86RcPxhR