
AI for Human Reasoning Fellowship


| Dimension | Assessment | Evidence |
| --- | --- | --- |
| Program Type | Fellowship + incubator hybrid | Combines research exploration with prototype building |
| Scale | 30 fellows | Inaugural cohort, July-October 2025 |
| Funding | $25K-$50K per fellow | Based on experience level |
| Outputs | 25+ projects | Most with working demos, papers, or deployed tools |
| Focus Area | Epistemic tools & coordination | AI for sensemaking, forecasting, negotiation, fact-checking |
| Location | SF Bay Area | Shared office space; some remote participation |
| Notable Achievement | First AI-approved Community Note | Nathan Young’s bot wrote the world’s first approved AI Community Note on X |


| Attribute | Details |
| --- | --- |
| Full Name | AI for Human Reasoning Fellowship |
| Organizer | Future of Life Foundation (FLF) |
| Duration | 12 weeks (July 14 - October 3, 2025) |
| Stipend | $25,000 (junior) to $50,000 (senior) |
| Fellows | 30 participants |
| Location | SF Bay Area shared office |
| Program Director | Ben Goldhaber |
| Program Managers | Timothy Telleen Lawton, Kathleen Finlinson |
| Website | aiforhumanreasoning.com |
| Organizer Website | flf.org |
| Est. Program Cost | $1-2M total (stipends + operations) |

The AI for Human Reasoning Fellowship was an inaugural program run by the Future of Life Foundation (FLF) that brought together researchers, builders, and entrepreneurs to develop AI tools designed to enhance human reasoning and coordination.1 The program operated as a hybrid between a research fellowship and a startup incubator, with fellows receiving substantial stipends ($25K-$50K) to explore and prototype beneficial AI applications.

The fellowship addressed what FLF describes as a critical gap: the world is “radically underinvested” in AI applications that could enhance human decision-making and coordination capabilities.2 While much attention focuses on AI risks, relatively little goes toward building tools that could help humanity navigate complex challenges—including those posed by AI itself.

The program structure consisted of three phases:

  1. Explore phase: Research, discussion, and ideation on potential projects
  2. Build phase: Creating prototypes and real-world implementations
  3. Translation phase: Polishing work, reflection, and public presentation at demo day

The fellowship targeted six key application areas for AI-augmented human reasoning:

| Area | Description | Example Projects |
| --- | --- | --- |
| Epistemic Tools | Fact-checking, rhetoric detection, information verification | Community Notes AI, Evidentry, Epistemic Evals |
| Forecasting & Scenarios | Prediction markets, strategic foresight, scenario planning | Deliberation Markets, Deep Future, Sentinel |
| Negotiation | AI-mediated high-stakes bargaining | Negotiation Station |
| Decision Support | Reasoning scaffolds, bias navigation | Confidence Interval, Chord |
| Evaluations | Benchmarking epistemic virtue and AI trustworthiness | DeliberationBench, Society Library evals |
| Coordination | Consensus-finding, collective sensemaking | Polis 2.0, Pivotal, Updraft |

**Community Notes AI** (Nathan Young, Robert Gordon): An AI system that writes Community Notes for X (formerly Twitter). Achieved a significant milestone: the world’s first AI-written Community Note approved through X’s rating system. The bot’s notes have been viewed over 2.5 million times.3

**Open Note Network** (Steve Isley): An AI system that generates Community Notes paired with a dedicated website hosting long-form fact checks, linking short-form social media corrections to comprehensive analysis.

**AI for the Epistemic Commons** (Herbie Bradley): Research on “Community Notes everywhere” for browsers and AI-written Wikipedia improvements. Built evaluations measuring models’ ability to fix errors in and expand Wikipedia articles.

**Evidentry** (Agita Pasaribu): Coalition infrastructure connecting survivors, platforms, and regulators to verify and remove AI-generated intimate imagery. Features multi-detector aggregation and verification workflows that reduce removal time from days to minutes.

**Deliberation Markets** (Siddarth Srinivasan): A novel prediction market mechanism in which participants write explanations rather than buying YES/NO contracts; LLMs evaluate the reasoning and synthesize it into probability estimates.
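
The mechanism can be illustrated with a minimal sketch (the function names and the fixed quality weights below are hypothetical; in the real system an LLM judges each explanation):

```python
# Toy sketch of a deliberation-market update: each participant submits an
# explanation, a judge assigns it a quality weight and a directional
# probability (the LLM's role in the real mechanism), and the market
# estimate is the quality-weighted average of those probabilities.

def synthesize_probability(explanations, prior=0.5, prior_weight=1.0):
    """Combine (quality, probability) pairs into one market estimate.

    explanations: list of (quality, probability) tuples, quality >= 0.
    The prior behaves like a pseudo-explanation of weight prior_weight,
    so an empty market sits at the prior.
    """
    numerator = prior_weight * prior
    denominator = prior_weight
    for quality, probability in explanations:
        numerator += quality * probability
        denominator += quality
    return numerator / denominator

# A weak argument barely moves the estimate off the 0.5 prior;
# adding one strong, well-reasoned argument pulls it toward 0.8.
weak_only = [(0.2, 0.9)]
with_strong = weak_only + [(3.0, 0.8)]
print(round(synthesize_probability(weak_only), 3))
print(round(synthesize_probability(with_strong), 3))
```

Because weight tracks reasoning quality rather than capital, a single well-reasoned argument can move the estimate substantially, which is the behavior the live demo highlighted.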

**Deep Future** (Gordon Brander): Strategic foresight tool powered by scenario methods from the US military and RAND. AI agents map strategic landscapes, identify driving forces, explore trajectories, and discover leverage points.

**Sentinel** (Nuno Sempere): Systems for detecting and tracking global risks, including the “Eye of Sauron” monitoring system, the xrisk.fyi tool, automated Twitter reports, and forecasting infrastructure that uses HDBSCAN clustering for risk analysis.

**Polis 2.0** (Colin Megill, Maximilian Kroner Dale): Real-time system for gathering and analyzing what large groups think using advanced statistics and ML. Conducted a survey in which 1,000+ quota-sampled Americans voted 90,000+ times on 1,000+ statements about AI concerns.
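
The bridging-consensus idea behind this kind of analysis can be sketched in a few lines (the statements, groups, and ballots below are invented for illustration; Polis itself uses much richer statistics):

```python
# Toy sketch of Polis-style "bridging" analysis: a statement bridges
# divides when every opinion group tends to agree with it, so we score
# each statement by its minimum agreement rate across groups.

votes = {
    # statement: {opinion group: per-voter ballots, 1 = agree, 0 = disagree}
    "AI deepfakes should be labeled": {"A": [1, 1, 1, 0], "B": [1, 1, 1, 1]},
    "Strict federal AI regulation":   {"A": [1, 1, 1, 1], "B": [0, 0, 1, 0]},
}

def agreement(ballots):
    return sum(ballots) / len(ballots)

def bridging_score(ballots_by_group):
    # The least-agreeing group bounds how "bridging" a statement is.
    return min(agreement(b) for b in ballots_by_group.values())

for statement, by_group in votes.items():
    print(f"{statement}: {bridging_score(by_group):.2f}")
```

On this toy data the deepfake-labeling statement scores well in both groups while the regulation statement splits them, echoing the survey’s finding of bridging consensus on deepfakes but a partisan split on regulation.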

**Chord** (Alex Bleakley): AI-orchestrated communication tool that parallelizes conversations to help groups make better, faster decisions. Led to the founding of Sylience after the fellowship.

**Pivotal** (Anand Shah, Parker Whitfill, Kai Sandbrink, Ben Sklaroff): Multi-agent orchestration tool helping teams coordinate on scheduling, action items, and organizational context. Integrates with existing software and automates workflows.

**Updraft** (Robert Gordon): Real-time facilitation tool in which groups and AI collaboratively map, cluster, and evolve ideas on a shared 2D canvas. Part of a connected experiment suite that includes Winnow and Prune.

**DeliberationBench** (Maximilian Kroner Dale, Luke Hewitt, Paul de Font-Reaulx): Novel AI persuasiveness benchmark using Deliberative Polls as a normative reference. Demonstrated findings in a 4,000-person randomized LLM persuasiveness experiment.

**Epistemic Evals** (Alejandro Botas): Evaluated human and model outputs for epistemic quality. Used LLMs to assess EA Forum posts on Reasoning/Clarity/Value dimensions and tested models’ sensitivity to epistemically irrelevant context.

**Society Library** (Jamie Joyce): Semi-automated information processing pipeline that produced a 600+ page intelligence report on complex government events from multiple viewpoints. Researching how structured datasets can serve as benchmarks for LLM truth-seeking.

| Project | Fellow(s) | Description |
| --- | --- | --- |
| Collective Agency | Blake Borgeson | Framework for increasing humanity’s collective agency through AI intermediaries; presented motifs for group collaboration (facilitators, orchestrators, AI intermediaries) and advocated “shovel-ready wisdom”: ideas for better collective processes that are ready to become technology |
| Future Visions Hub | Sofia Vanhanen | Group decision-making software and epistemic infrastructure for collective sensemaking about desirable futures |
| Confidence Interval | Vaughn Tan | Self-service webapp using LLMs as a Socratic mirror for making subjective arguments rigorous; currently used by college students, with interest from startups and governments |
| Negotiation Station | Kai Sandbrink | AI tools acting as trusted mediators in high-stakes negotiations between nations and corporations |
| AI Policy Simulation | Alexander van Grootel, Emma Kumleben | AI aiding institutional decision-making through strategic foresight and forecasting for navigating the AGI transition |
| Virtuous | Paul de Font-Reaulx | Epistemic evaluations for frontier models; developing DeliberationBench |
| RiskWatch | Alyssia Jovellanos, Martin Ciesielski-Listwan | Risk Threat Observatory enabled by prediction markets |
| Worker-Owned Startups | Ben Sklaroff | Governance models for worker-owned startup structures |
| Agent Strategy Arena | Joshua Levy | Platform for scalable, grounded evaluations of AI agents’ prediction accuracy |
| AI Discourse Sensemaking | Matthew Brooks, Emma Kumleben, Niki Dupuis | Using LLMs to map opinion landscapes and detect polarization; built a semi-automated argument mapper |

The 30 fellows came from diverse backgrounds including academia, entrepreneurship, policy, and technology:

| Fellow | Background/Affiliation |
| --- | --- |
| Blake Borgeson | Collaboration AI researcher |
| Colin Megill | Polis creator |
| Nathan Young | Manifold community, forecasting |
| Robert Gordon | Goodheart Labs |
| Herbie Bradley | AI researcher |
| Nuno Sempere | Samotsvety, QURI, Sentinel founder |
| Kai Sandbrink | Multi-project contributor |
| Gordon Brander | Strategic foresight |
| Jamie Joyce | Society Library |
| Sofia Vanhanen | Future Visions Hub |
| Agita Pasaribu | Evidentry |
| And 19 others | Various backgrounds |

The fellowship included advisors with expertise in AI safety, mechanism design, and coordination:

  • Anthony Aguirre - President of FLF, Executive Director of Future of Life Institute
  • Andreas Stuhlmüller - Founder of Elicit
  • Brendan Fong - Category theory and applied mathematics
  • Additional advisors from academia and industry

The Future of Life Foundation (FLF) is a separate organization from the Future of Life Institute (FLI), though both share leadership (Anthony Aguirre serves as Executive Director of FLI and President of FLF). While FLI focuses primarily on existential risk advocacy and grantmaking, FLF operates more as an incubator for beneficial AI applications.

The fellowship produced several concrete outcomes:

| Outcome Type | Count/Details |
| --- | --- |
| Projects launched | 25+ with demos or papers |
| Open-source tools | Multiple GitHub repositories |
| Academic papers | DeliberationBench, Deliberation Markets |
| Companies founded | Sylience (from the Chord project) |
| Real-world deployment | AI Community Notes with 2M+ views |
| Research artifacts | Evaluations, benchmarks, datasets |

FLF indicated willingness to provide funding beyond the fellowship period or assist fellows in launching new organizations based on their work.2

All fellows presented their work at a demo day. Video presentations with auto-generated transcripts are available on YouTube.

| Project | Presenter(s) |
| --- | --- |
| Collective Agency | Blake Borgeson |
| Pivotal | Anand Shah, Parker Whitfill, Kai Sandbrink, Ben Sklaroff |
| Polis 2.0 | Colin Megill, Maximilian Kroner Dale |
| Deliberation Markets | Siddarth Srinivasan |
| Community Notes AI | Nathan Young, Robert Gordon |
| Open Note Network | Steve Isley |
| AI for Epistemic Commons | Herbie Bradley |
| Evidentry | Agita Pasaribu |
| Worker-Owned Startups | Ben Sklaroff |
| Society Library | Jamie Joyce |
| AI Discourse Sensemaking | Matthew Brooks, Emma Kumleben, Niki Dupuis |
| Confidence Interval | Vaughn Tan |
| Virtuous | Paul de Font-Reaulx |
| Epistemic Evals | Alejandro Botas |
| DeliberationBench | Maximilian Kroner Dale, Luke Hewitt, Paul de Font-Reaulx |
| Forecasting & Provenance | Alyssia Jovellanos, Martin Ciesielski-Listwan |
| Agent Strategy Arena | Joshua Levy |
| AI Policy Simulation | Alexander van Grootel, Emma Kumleben |
| Deep Future | Gordon Brander |
| Future Visions Hub | Sofia Vanhanen |
| Tools for Sensemaking | Matthew Brooks |
| Negotiation Station | Kai Sandbrink |
| Chord | Alex Bleakley |

Key insights from the demo day presentations (based on transcript analysis):

  • Polis 2.0: Surveyed 1,000+ Americans who voted 90,000+ times on AI concerns. Found bridging consensus on deep fakes and privacy, with partisan split on regulation approach. New features include semantic topic clustering and LLM-generated consensus summaries.
  • Deliberation Markets: Live demo showed market probability shifting 32%→65% from a single well-reasoned argument—demonstrating how explanation quality drives predictions. Core innovation: LLMs evaluate reasoning quality and trade on synthesized probabilities.
  • Community Notes AI: Early notes reduce misleading tweet shares by 25-50%; late notes have almost no effect. Team estimates $150K-$500K needed to scale to TikTok, Chrome extensions, and Perplexity.
  • Deep Future: Compresses week-long RAND-style scenario planning workshops into 10-15 minutes. Demo identified 38 driving forces for “How will AI agents transform the web by 2030?” and generated strategic reports with opportunities, threats, and early warning signals.
  • Collective Agency: Blake Borgeson presented motifs for AI-human collaboration patterns—facilitators, orchestrators, and AI intermediaries—across collaboration phases (understand, explore, decide, coordinate, create, share, reflect).

| Strength | Evidence |
| --- | --- |
| Novel focus area | Few programs specifically target AI for epistemics/coordination |
| Concrete outputs | Most projects have working demos, not just research |
| Diverse approaches | Covered forecasting, fact-checking, negotiation, collective decision-making |
| Real deployment | Some tools already in use (Community Notes AI) |
| Open source | Many projects released code publicly |


| Limitation | Notes |
| --- | --- |
| First cohort | No track record yet for long-term impact |
| Prototype stage | Most projects still early; unclear which will scale |
| Narrow ecosystem | Fellows largely from EA/rationalist-adjacent networks |
| Evaluation difficulty | Hard to measure the impact of “reasoning improvement” tools |


| Item | Relationship |
| --- | --- |
| Future of Life Institute | Related organization; shared leadership with FLF |
| QURI | Similar focus on epistemic tools; Nuno Sempere connection |
| Manifold Markets | Prediction market platform; Nathan Young connection |
| Elicit | AI research tool; Andreas Stuhlmüller as advisor |
| AI Safety Training Programs | Complementary fellowship in a different focus area |

  1. AI for Human Reasoning Fellowship website

  2. “FLF Fellowship on AI for Human Reasoning: $25-50k, 12 weeks,” EA Forum

  3. “World’s First AI Community Note,” Nathan Young’s Substack