Anthropic Raises Series B to build steerable, interpretable, robust AI systems \ Anthropic
Web Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Anthropic
Anthropic's Series B funding announcement outlines the company's early research agenda in AI safety, including interpretability, steerability, and robustness, signaling institutional commitment to safety-focused AI development at scale.
Metadata
Summary
Anthropic announced a $580 million Series B round in April 2022 to fund large-scale AI safety research infrastructure. The company outlined progress in interpretability (reverse engineering language model behavior), steerability, and robustness, including work on making models more helpful and harmless. The funding aims to develop AI systems with better implicit safeguards and tools to verify those safeguards work.
Key Points
- Anthropic raised a $580M Series B led by Sam Bankman-Fried to fund AI safety research and infrastructure.
- Research focus areas include interpretability, steerability, and robustness of large language models.
- Progress made on mathematically reverse engineering small language models and understanding pattern-matching in large ones.
- Company released datasets and techniques to help other labs train models more aligned with human preferences.
- Plans to expand policy and societal impact research alongside technical safety work.
1 FactBase fact citing this source
| Entity | Property | Value | As Of |
|---|---|---|---|
| Anthropic | Valuation | $4 billion | Apr 2022 |
Cached Content Preview
Announcements: Anthropic Raises Series B to build steerable, interpretable, robust AI systems
Apr 29, 2022 Anthropic, an AI safety and research company, has raised $580 million in a Series B. The financing will help Anthropic build large-scale experimental infrastructure to explore and improve the safety properties of computationally intensive AI models.
Since its founding at the beginning of 2021, Anthropic has conducted research into making systems that are more steerable, robust, and interpretable. On interpretability, it has made progress in mathematically reverse engineering the behavior of small language models and begun to understand the source of pattern-matching behavior in large language models. On steerability and robustness, it has developed baseline techniques to make large language models more “helpful and harmless”, and followed this up with reinforcement learning to further improve these properties, as well as releasing a dataset to help other research labs train models that are more aligned with human preferences. It has also released an analysis of sudden changes in performance in large language models and the societal impacts of this phenomenon, which demonstrates the need for studying safety issues at scale.
The purpose of this research is to develop the technical components necessary to build large-scale models that have better implicit safeguards and require fewer after-training interventions, as well as to develop the tools needed to look inside these models and verify that the safeguards actually work. The company is also building out teams and partnerships dedicated to exploring the policy and societal impacts of these models.
“With this fundraise, we’re going to explore the predictable scaling properties of machine learning systems, while closely examining the unpredictable ways in which capabilities and safety issues can emerge at-scale,” said Anthropic co-founder and CEO Dario Amodei. “We’ve made strong initial progress on understanding and steering the behavior of AI systems, and are gradually assembling the pieces needed to make usable, integrated AI systems that benefit society.”
Anthropic is now a growing team of around 40 people based in a plant-filled office in San Francisco, California, with plans to expand further this year. “Now that we’ve built out the organization, we’re focusing on ensuring Anthropic has the culture and governance to continue to responsibly explore and develop safe AI systems as we scale,” said Anthropic co-founder and President Daniela Amodei. “We’re excited about what’s ahead, and grateful to all be working together.”
The Series B follows the company raising $124 million in a Series A round in 2021. The Series B round was led by Sam Bankman-Fried, CEO of FTX. The round also included participation from Caroline Ellison, Jim McClave, Nishad Singh, Jaan Tallinn, and the Cente
... (truncated, 3 KB total)