Summary
Internal documentation page explaining a tool for finding interesting data patterns in structured tables by flagging paradoxical rating combinations (e.g., safety approaches that advance capabilities, severe risks that are hard to detect). Provides criteria and examples for how the tool identifies potential insight-worthy content.
The Table Candidates tool scans structured data tables for rows with "paradoxical" or notable rating combinations that suggest insight-worthy content.
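At its core, the scan is a filter: each source table has a set of criteria, and any row matching at least one criterion becomes a candidate. Below is a minimal sketch of that loop in Python; the `Criterion`, `Candidate`, and `scan_table` names and the plain-dict row representation are illustrative assumptions, not the tool's actual internals.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Criterion:
    """Pairs a matching predicate with the reason a match is interesting."""
    name: str
    why_interesting: str
    matches: Callable[[dict], bool]

@dataclass
class Candidate:
    """A flagged row plus every criterion it satisfied."""
    row: dict
    matched: list[Criterion] = field(default_factory=list)

def scan_table(rows: list[dict], criteria: list[Criterion]) -> list[Candidate]:
    """Flag every row that satisfies at least one criterion."""
    candidates = []
    for row in rows:
        matched = [c for c in criteria if c.matches(row)]
        if matched:
            candidates.append(Candidate(row=row, matched=matched))
    return candidates
```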
Safety Approaches Criteria
Rows are flagged when they have:
| Criterion | Why It's Interesting |
|---|---|
| Capability-dominant differential progress | Safety research that primarily advances capabilities is surprising |
| Weak/no deception robustness | Reveals fundamental limitations of popular approaches |
| PRIORITIZE recommendation | Identifies underfunded high-value research |
| DEFUND/REDUCE recommendation | Challenges conventional wisdom on research priorities |
| Unclear/harmful net safety | Questions whether "safety" work actually makes things safer |
| Doesn't scale to superintelligence | Important limitation for long-term planning |
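Expressed against the sketch above, these criteria could look like the following. The rating field names and enum values (`differential_progress`, `"CAPABILITY-DOMINANT"`, and so on) are guesses for illustration, not the table's real schema.

```python
# Hypothetical field names and values; the real table schema may differ.
SAFETY_APPROACH_CRITERIA = [
    Criterion(
        name="Capability-dominant differential progress",
        why_interesting="Safety research that primarily advances capabilities is surprising",
        matches=lambda r: r.get("differential_progress") == "CAPABILITY-DOMINANT",
    ),
    Criterion(
        name="Weak/no deception robustness",
        why_interesting="Reveals fundamental limitations of popular approaches",
        matches=lambda r: r.get("deception_robustness") in ("WEAK", "NONE"),
    ),
    Criterion(
        name="DEFUND/REDUCE recommendation",
        why_interesting="Challenges conventional wisdom on research priorities",
        matches=lambda r: r.get("recommendation") in ("DEFUND", "REDUCE"),
    ),
    # The PRIORITIZE, net-safety, and scalability criteria follow the same pattern.
]
```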
Accident Risks Criteria
Rows are flagged when they have:
| Criterion | Why It's Interesting |
|---|---|
| Catastrophic/existential severity + hard to detect | Worst-case scenarios we can't easily monitor |
| Lab-demonstrated + severe | Empirical evidence of serious risks |
| Current timeline + severe | Not hypothetical; happening now |
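Unlike the safety-approach criteria, each accident-risk criterion combines two ratings per row. Continuing the same sketch, again with assumed field names and values:

```python
SEVERE = {"CATASTROPHIC", "EXISTENTIAL"}  # assumed severity values

ACCIDENT_RISK_CRITERIA = [
    Criterion(
        name="Catastrophic/existential severity + hard to detect",
        why_interesting="Worst-case scenarios we can't easily monitor",
        matches=lambda r: r.get("severity") in SEVERE and r.get("detectability") == "HARD",
    ),
    Criterion(
        name="Lab-demonstrated + severe",
        why_interesting="Empirical evidence of serious risks",
        matches=lambda r: r.get("evidence") == "LAB-DEMONSTRATED" and r.get("severity") in SEVERE,
    ),
    Criterion(
        name="Current timeline + severe",
        why_interesting="Not hypothetical; happening now",
        matches=lambda r: r.get("timeline") == "CURRENT" and r.get("severity") in SEVERE,
    ),
]
```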
Using the Tool
1. Browse candidates sorted by source table (Safety Approaches, Accident Risks).
2. Review the matched criteria to understand why each row was flagged.
3. Check the row's key ratings for full context.
4. Copy the insight template as a starting point.
5. Refine and verify the insight before adding it to insights.yaml (see the sketch after this list).
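As a rough sketch of that last step, assuming insights.yaml holds a top-level `insights` list (the real file's schema may differ), a refined candidate could be appended like this:

```python
import yaml  # requires PyYAML

def append_insight(path: str, candidate: Candidate, text: str) -> None:
    """Append a refined candidate to insights.yaml; the schema here is assumed."""
    try:
        with open(path) as f:
            data = yaml.safe_load(f) or {}
    except FileNotFoundError:
        data = {}
    data.setdefault("insights", []).append({
        "source_table": "safety-approaches",  # hypothetical field
        "matched_criteria": [c.name for c in candidate.matched],
        "text": text,  # the refined, verified insight
    })
    with open(path, "w") as f:
        yaml.safe_dump(data, f, sort_keys=False)
```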
Example Insights from Tables
From Safety Approaches table:
"RLHFCapabilityRLHFRLHF/Constitutional AI achieves 82-85% preference improvements and 40.8% adversarial attack reduction for current systems, but faces fundamental scalability limits: weak-to-strong supervision shows...Quality: 63/100 provides primarily capability uplift (DOMINANT) with limited safety benefit (LOW-MEDIUM), and fundamentally cannot scale to superhuman tasks where humans can't evaluate outputs."
From Accident Risks table:
"Deceptive alignmentRiskDeceptive AlignmentComprehensive analysis of deceptive alignment risk where AI systems appear aligned during training but pursue different goals when deployed. Expert probability estimates range 5-90%, with key empir...Quality: 75/100 represents an existential risk that is very difficult to detect. 78% alignment faking rate observed in AnthropicOrganizationAnthropicComprehensive reference page on Anthropic covering financials (\$380B valuation, \$19B ARR), safety research (Constitutional AI, mechanistic interpretability, model welfare), governance (LTBT struc...Quality: 74/100's 2024 study."