Critiques of prominent AI safety labs: Redwood Research

web

2023·EA Forum·forum.effectivealtruism.org/posts/DaRvpDHHdaoad9Tfu/criti...

Author

Omega

Credibility Rating

3/5

Good(3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: EA Forum

Part of an EA Forum series critically evaluating AI safety organizations; useful for understanding community debates about research strategy and organizational effectiveness at Redwood Research.

Metadata

Importance: 38/100 · commentary

Summary

A critical examination of Redwood Research as an AI safety organization, evaluating its research directions, methodology, and overall impact on the field. The post likely explores concerns about the lab's approach to technical AI safety work and its effectiveness in addressing alignment challenges.

Key Points

• Provides critical analysis of Redwood Research's research agenda and methodology from an EA community perspective
• Examines whether Redwood's technical safety work is well-directed and likely to reduce existential risk
• Part of a broader series critiquing prominent AI safety organizations to foster accountability and strategic clarity
• Offers community-sourced perspectives on the strengths and weaknesses of Redwood's approach to alignment
• Useful for understanding internal EA/AI safety community debates about which research directions are most promising

Cached Content Preview

HTTP 200 · Fetched Apr 10, 2026 · 45 KB

# Critiques of prominent AI safety labs: Redwood Research
By Omega
Published: 2023-03-31
*Crossposted to* [*LessWrong*](https://www.lesswrong.com/posts/SuZ6Guuos7CjfwRQb/critiques-of-prominent-ai-safety-labs-redwood-research)*.*

*This is the first post in our sequence and covers Redwood Research (Redwood). We recommend reading our brief* [*introduction*](https://forum.effectivealtruism.org/posts/N4LKrktopDs5Qdqgn/an-introduction-to-critiques-of-prominent-ai-safety) *to the sequence for added context on our motivations, who we are, and our overarching views on alignment research.*

Redwood is a non-profit, founded in 2021, that works on technical AI safety (TAIS) alignment research. Their approach is heavily informed by the work of Paul Christiano, who runs the [Alignment Research Center](https://alignment.org/) (ARC), and previously ran the language model alignment team at OpenAI. Paul [proposed one of Redwood's original projects](https://www.lesswrong.com/posts/pXLqpguHJzxSjDdx7/why-i-m-excited-about-redwood-research-s-current-project) and is on Redwood’s board. Redwood has strong connections with central EA leadership and funders, has received significant funding since its inception, recruits almost exclusively from the EA movement, and partly acts as a gatekeeper to central EA institutions.

We shared a draft of this document with Redwood prior to publication and are grateful for their feedback and corrections (we recommend [others also reach out similarly](https://forum.effectivealtruism.org/posts/f77iuXmgiiFgurnBu/run-posts-by-orgs)). We’ve also invited them to share their views in the comments of this post.

We would like to also invite others to share their thoughts in the comments openly if you feel comfortable, or contribute anonymously via [this form](https://forms.gle/b1i96nG8J8mnUDCx8). We will add inputs from there to the comments section of this post, but will likely not be updating the main body of the post as a result (unless comments catch errors in our writing).

Summary of our views
====================

We believe that Redwood has some serious flaws as an org, yet has received a significant amount of funding from a central EA grantmaker (Open Philanthropy). Conflicts of interest (COIs) that were not adequately kept in check might be partly responsible for funders giving a relatively immature org a large amount of money, and for some resulting negative effects on the field and EA community. We will share our critiques of Constellation (and Open Philanthropy) in a follow-up post. We also have some suggestions for Redwood that we believe might help them achieve their goals.

Redwood is a young organization that has room to improve. While there may be flaws in their current approach, it is possible for them to learn and adapt in order to produce more accurate and reliable results in the future. Many successful organizations made significant pivots while at a similar scale to Redwood, and we remain cautiously optimistic about Redwood's future potenti

... (truncated, 45 KB total)

Resource ID: b6d60da50536f792 | Stable ID: sid_8wuGb32Ynw