Developing noise-injection methods to reveal and reduce deceptive behaviors in language models prior to deployment

$40K

Funder

Long-Term Future Fund (LTFF)

Recipient

Adelin Kassler

Program

Long-Term Future Fund Grant Rounds

Date

Jul 2024

Data source

Source

funds.effectivealtruism.org

Notes

[Long-Term Future Fund] Developing noise-injection methods to reveal and reduce deceptive behaviors in language models prior to deployment

Other Grants by Long-Term Future Fund (LTFF)

544

Grant	Recipient	Amount	Date
6-month salary to translate AGI safety-related texts, e.g. LessWrong and AI Alignment Forum, into Russian	Maksim Vymenets	$13K	Jan 2022
Working on long-term macrostrategy and AI Alignment, and up-skilling and career transition towards that goal	Tushant Jha	$40K	Jan 2020
Characterizing the properties and constraints of complex systems and their external interactions to inform AI safety research	Alexander Siegenfeld	$20K	Jul 2019
6-month salary to write a book on philosophy + history of longtermist thinking, while longer-term funding is arranged	Thomas Moynihan	$28K	Oct 2021
12-month salary for researching value learning	Charlie Steiner	$50K	Jan 2022
Conducting a computational study on using a light-to-vibrations mechanism as a targeted antiviral	Gavin Taylor	$30K	Jul 2020
Support Sam's participation in ‘Mid-term AI impacts’ research project	Sam Clarke	$4.5K	Oct 2020
PhD at Cambridge	Richard Ngo	$150K	Jul 2020
Funding a Nordic conference for senior X-risk researchers and junior talents interested in entering the field	Effektiv Altruism Sverige (EA Sweden)	$4.6K	Oct 2021
Funding for a degree in the Biological Sciences at UCSD (University of California San Diego)	Kristaps Zilgalvis	$250K	Oct 2021

Showing 10 of 544 grants
