Longterm Wiki

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

A key MIRI-affiliated research sequence; essential reading for understanding agent-foundations approaches to alignment, particularly the theoretical gaps in classical agent models that motivate much of MIRI's and Redwood's technical research agendas.

Metadata

Importance: 82/100 · blog post · primary source

Summary

A foundational sequence by Scott Garrabrant and Abram Demski examining the deep theoretical challenges that arise when AI agents are embedded within—rather than external to—the environments they reason about. It addresses decision theory, world-modeling, and alignment under the realistic condition that an agent is itself a physical subsystem of the world it must model and act upon.

Key Points

  • Challenges classical AI agent models that assume a clean separation between the agent and its environment, arguing this separation breaks down for real-world AI systems.
  • Explores decision theory for embedded agents, including issues of self-reference, logical uncertainty, and how agents should reason about their own causal role.
  • Addresses 'embedded world-models': how an agent can form accurate models of a world it is physically part of, including modeling itself.
  • Covers robust delegation and subsystem alignment—how to ensure sub-components of an AI system remain aligned with the broader system's goals.
  • Frames these challenges as prerequisites for solving alignment, arguing that standard frameworks (e.g., AIXI) fail to address them adequately.
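The agent/environment separation criticized in the first point can be made concrete. Below is a minimal, hypothetical sketch (all class and function names are illustrative inventions, not code from the sequence) of the dualistic interaction loop that classical frameworks like AIXI assume: the agent and environment exchange observations and actions across a clean interface, and the environment's state never contains the agent itself.

```python
# Hypothetical toy sketch of the "dualistic" agent-environment loop.
# The key assumption on display: the environment's state (a counter)
# contains nothing about the agent's memory, computation, or substrate.

class CounterEnv:
    """Toy environment: a counter the agent can increment."""
    def __init__(self):
        self.count = 0

    def reset(self):
        return self.count

    def step(self, action):
        if action == "inc":
            self.count += 1
        return self.count

class GreedyAgent:
    """Toy agent that always increments; its internals live entirely
    outside CounterEnv's state."""
    def act(self, observation):
        return "inc"

def dualistic_loop(agent, env, steps):
    # Clean boundary: observations go out, actions come in; the world
    # never models the agent as one of its own subsystems.
    obs = env.reset()
    for _ in range(steps):
        obs = env.step(agent.act(obs))
    return obs

print(dualistic_loop(GreedyAgent(), CounterEnv(), 3))  # → 3
```

An embedded agent, on the sequence's framing, has no such boundary: its computation and memory would be part of the environment's own state, so the environment could alter the agent mid-decision, which is exactly what this toy loop cannot represent.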

Cited by 3 pages

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 1 KB
Collected by: Common Crawl (web crawl data).
The Wayback Machine - http://web.archive.org/web/20260215042529/https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh

 

AI ALIGNMENT FORUM

Embedded Agency

Oct 29, 2018 by abramdemski

This is a sequence by Scott Garrabrant and Abram Demski on one current way of thinking about alignment: Embedded Agency.

Full-Text Version

  • Embedded Agency (full-text version), by Scott Garrabrant and abramdemski · 7y · 54 karma · 4 comments

Posts in the sequence

  • Embedded Agents, by abramdemski and Scott Garrabrant · 7y · 45 karma · 7 comments
  • Decision Theory, by abramdemski and Scott Garrabrant · 7y · 34 karma · 14 comments
  • Embedded World-Models, by abramdemski and Scott Garrabrant · 7y · 25 karma · 5 comments
  • Robust Delegation, by abramdemski and Scott Garrabrant · 7y · 33 karma · 2 comments
  • Subsystem Alignment, by abramdemski and Scott Garrabrant · 7y · 28 karma · 3 comments
  • Embedded Curiosities, by Scott Garrabrant and abramdemski · 7y · 26 karma · 0 comments
Resource ID: bbc4bc9c2577c2d0 | Stable ID: sid_CXneSlzNlf