EA Forum - Ought's Theory of Change
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: EA Forum
This post by Ought (now dissolved into Elicit) articulates the organization's strategic rationale for process-based supervision as a bridge between near-term AI utility and long-term alignment, relevant to debates about scalable oversight and reward hacking.
Forum Post Details
Metadata
Summary
Ought explains its strategic approach to AI safety and beneficial AI through process-based machine learning, where systems are trained to supervise reasoning steps rather than outcomes. This methodology underpins their tool Elicit, designed to augment human reasoning on complex problems. They argue this approach simultaneously provides near-term value and advances long-term alignment goals by reducing outcome gaming.
Key Points
- Ought builds process-based ML systems that supervise intermediate reasoning steps rather than final outcomes, reducing misalignment risks.
- Their flagship product Elicit is an AI research assistant aimed at scaling open-ended reasoning for complex tasks.
- Process supervision is positioned as both practically useful short-term and a meaningful contribution to AI alignment long-term.
- Target domains include AI governance, climate change, and economic development—areas where improved reasoning could have large positive impact.
- The theory of change links commercial AI tooling directly to alignment research, framing them as complementary rather than competing goals.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Elicit (AI Research Tool) | Organization | 63.0 |
Cached Content Preview
# Ought's theory of change
By stuhlmueller, jungofthewon
Published: 2022-04-12
[Ought](https://ought.org/) is an applied machine learning lab. In this post we summarize our work on [Elicit](https://elicit.org/) and why we think it's important.
We'd love to get feedback on how to make Elicit more useful to the EA community, and on our plans more generally.
This post is based on two recent LessWrong posts:
* [Supervise Process, not Outcomes](https://www.lesswrong.com/posts/pYcFPMBtQveAjcSfH/supervise-process-not-outcomes)
* [Elicit: Language Models as Research Assistants](https://www.lesswrong.com/posts/s5jrfbsGLyEexh4GT/elicit-language-models-as-research-assistants)
In short
--------
Our mission is to automate and scale open-ended reasoning. To that end, we’re building Elicit, the AI research assistant.
Elicit's architecture is based on [supervising reasoning processes, not outcomes](https://www.lesswrong.com/posts/pYcFPMBtQveAjcSfH/supervise-process-not-outcomes). This is better for supporting open-ended reasoning in the short run and better for alignment in the long run.
[Over the last year](https://www.lesswrong.com/posts/s5jrfbsGLyEexh4GT/elicit-language-models-as-research-assistants#Progress_in_2021), we built Elicit to support broad reviews of empirical literature. The literature review workflow runs on general-purpose infrastructure for executing compositional language model processes. [Going forward](https://www.lesswrong.com/posts/s5jrfbsGLyEexh4GT/elicit-language-models-as-research-assistants#Roadmap_for_2022_), we'll expand to deep literature reviews, then other research workflows, then general-purpose reasoning.
Our mission
-----------
Our mission is to automate and scale open-ended reasoning. If we can improve the world’s ability to reason, we’ll unlock positive impact across many domains including AI governance & alignment, psychological well-being, economic development, and climate change.
As AI advances, the raw cognitive capabilities of the world will increase. The goal of our work is to channel this growth toward good reasoning. We want AI to be more helpful for qualitative research, long-term forecasting, planning, and decision-making than for persuasion, keeping people engaged, and military robotics.
Good reasoning is as much about process as it is about outcomes. In fact, outcome data is unavailable when we're reasoning about the long term. So we're generally not training machine learning models end-to-end on outcome data; instead, we build Elicit compositionally, based on human reasoning processes.
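The compositional, process-based approach described above can be illustrated with a minimal sketch. This is a hypothetical toy, not Ought's actual code: the function names (`decompose`, `answer_subquestion`, `aggregate`, `review_step`) are invented for illustration. The point is that each intermediate step is explicit and inspectable, so supervision can target the reasoning itself rather than only a final answer.

```python
# Hypothetical sketch (not Ought's implementation): a compositional
# pipeline where every intermediate step is explicit, in contrast to a
# single end-to-end model scored only on its final output.

def decompose(question):
    # Break a research question into smaller sub-questions.
    return [f"What does source {i} say about: {question}?" for i in (1, 2)]

def answer_subquestion(subq):
    # Placeholder for a language-model call on one sub-question.
    return f"stub answer to [{subq}]"

def aggregate(sub_answers):
    # Combine the reviewed sub-answers into a final summary.
    return " | ".join(sub_answers)

def process_based_pipeline(question, review_step=lambda name, value: value):
    # Each step's output passes through `review_step`, the hook where a
    # human (or a supervising model) can inspect and correct the reasoning
    # before it feeds into the next step.
    subqs = review_step("decompose", decompose(question))
    subs = [review_step("answer", answer_subquestion(q)) for q in subqs]
    return review_step("aggregate", aggregate(subs))

result = process_based_pipeline("Does X cause Y?")
print(result)
```

Because supervision hooks into every step, a bad decomposition or a wrong sub-answer can be caught where it occurs, rather than being inferred from a gamed or delayed outcome signal.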
The case for process-based ML systems
-------------------------------------
We can think about machine learning systems on a [spectrum](https://www.lesswrong.com/posts/pYcFPMBtQveAjcSfH/supervise-process-not-outcomes#The_spectrum) from process-based to outcome-based:
* Process-based systems are built on human-understandable task decompositions, with direct supervision of reasoning steps. [More](https://www.lesswron
... (truncated, 8 KB total)