Longterm Wiki
A Timing Problem for Instrumental Convergence

web

Authors

Rhys Southan·Helena Ward·Jen Semler

Credibility Rating

4/5
High (4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Springer

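A peer-reviewed philosophy article arguing that instrumental goal preservation, one of the goals posited by the instrumental convergence thesis, fails because of a "timing problem": an agent that abandons or changes its goal does not thereby fail to take a required means to a goal it has. The paper is relevant to assessing which instrumental goals an AI system can be expected to pursue regardless of its final objectives.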

Paper Details

Citations: 0
Year: 2025
Methodology: peer-reviewed
Categories: Philosophical Studies

Metadata

journal article · primary source

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 1 KB
# A timing problem for instrumental convergence
Authors: Rhys Southan, Helena Ward, Jen Semler
Journal: Philosophical Studies
Published: 2025-07-03
DOI: 10.1007/s11098-025-02370-4
## Abstract

Those who worry about a superintelligent AI destroying humanity often appeal to the instrumental convergence thesis—the claim that even if we don’t know what a superintelligence’s ultimate goals will be, we can expect it to pursue various instrumental goals which are useful for achieving most ends. In this paper, we argue that one of these proposed goals is mistaken. We argue that instrumental goal preservation—the claim that a rational agent will tend to preserve its goals because that makes it better at achieving its goals—is false on the basis of the timing problem: an agent which abandons or otherwise changes its goal does not thereby fail to take a required means for achieving a goal it has. Our argument draws on the distinction between means-rationality (adopting suitable means to achieve an end) and ends-rationality (choosing one’s ends based on reasons). Because proponents of the instrumental convergence thesis are concerned with means-rationality, we argue, they cannot avoid the timing problem. After defending our argument against several objections, we conclude by considering the implications our argument has for the rest of the instrumental convergence thesis and for AI safety more generally.
Resource ID: 908c9bc04dcf353f | Stable ID: sid_lb09XoooUD