RoastMyPost
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Innovation | Moderate | Multi-agent evaluation approach for document review |
| Practical Impact | Growing | Useful for pre-publication review of research posts |
| Technical Maturity | Experimental | Developer acknowledges significant false positive rate |
| Integration | Good | Direct import from LessWrong and EA Forum |
| Accessibility | High | Free, web-based, no setup required |
| Output Quality | Mixed | Helpful for catching errors but requires human filtering |
Project Details
| Attribute | Details |
|---|---|
| Name | RoastMyPost |
| Organization | QURI (Quantified Uncertainty Research Institute) |
| Lead | Ozzie Gooen |
| Launched | December 2025 |
| Primary Model | Claude Sonnet 4.5 |
| Fact-Checking | Perplexity integration |
| Website | roastmypost.org |
| Source | GitHub (open-source) |
Overview
RoastMyPost is an experimental web application that uses large language models to evaluate written content through multiple specialized AI evaluators [1]. Developed by Ozzie Gooen at QURI, the platform analyzes documents for errors, logical fallacies, factual inaccuracies, and other issues that human reviewers might miss or find tedious to check manually.
The tool is designed to provide "roasts": critical feedback that highlights potential problems in written work before publication. Unlike general-purpose AI assistants, RoastMyPost deploys specialized evaluator agents that each focus on a specific type of analysis.
The platform is particularly relevant to the AI safety and rationalist communities because it can import posts directly from LessWrong and the EA Forum via URL, which makes it easy to get feedback on the research posts common in these communities.
How It Works
Import Methods
- Direct text: Paste markdown content directly
- Forum URLs: Import posts from LessWrong and EA Forum automatically
- Web URLs: Extract content from general web pages
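The three import paths above can be pictured as a small routing step. The sketch below is illustrative only and is not taken from the RoastMyPost codebase: the function name `classify_input`, the returned labels, and the exact host list in `FORUM_HOSTS` are all assumptions.

```python
from urllib.parse import urlparse

# Assumed host names for the two supported forums (hypothetical list).
FORUM_HOSTS = {"lesswrong.com", "www.lesswrong.com", "forum.effectivealtruism.org"}


def classify_input(value: str) -> str:
    """Return 'direct_text', 'forum_url', or 'web_url' for a user submission."""
    parsed = urlparse(value.strip())
    # Anything that is not an http(s) URL with a host is treated as pasted text.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return "direct_text"
    # hostname is lower-cased by urlparse, so the set lookup is case-insensitive.
    return "forum_url" if parsed.hostname in FORUM_HOSTS else "web_url"
```

A forum URL would presumably be fetched through a forum-specific importer, while any other URL would fall back to generic page extraction.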
Evaluators
RoastMyPost runs multiple specialized evaluators in parallel [1]:
| Evaluator | Function |
|---|---|
| Fact Checker | Uses Perplexity searches to verify factual claims |
| Spelling/Grammar | Identifies language errors |
| Logical Fallacy Detector | Flags potential reasoning errors |
| Math Verifier | Checks mathematical equations and calculations |
| Link Validator | Tests whether referenced URLs are accessible |
| Binary Forecast Checker | Compares predictions against actual outcomes |
| Epistemic Auditor | High-level assessment of reasoning quality |
Processing typically completes in 1-5 minutes depending on document length.
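As a rough illustration of running evaluators in parallel, the sketch below uses Python's `asyncio.gather` with two toy evaluators. It is not RoastMyPost's implementation: the `Finding` type, the evaluator functions, and their trivial checks are hypothetical stand-ins for real model or search calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    evaluator: str
    comment: str


async def spelling_check(text: str) -> list[Finding]:
    # Toy stand-in: flags a single hard-coded misspelling.
    await asyncio.sleep(0)  # placeholder for a real model or API call
    if "teh" in text.split():
        return [Finding("spelling", "'teh' should be 'the'")]
    return []


async def link_check(text: str) -> list[Finding]:
    # Toy stand-in: flags non-HTTPS links without any network access.
    await asyncio.sleep(0)
    return [
        Finding("links", f"non-HTTPS link: {word}")
        for word in text.split()
        if word.startswith("http://")
    ]


async def run_evaluators(text: str) -> list[Finding]:
    # gather() runs the evaluators concurrently and keeps their results in order.
    groups = await asyncio.gather(spelling_check(text), link_check(text))
    return [finding for group in groups for finding in group]


findings = asyncio.run(run_evaluators("See teh source at http://example.com today"))
```

Because each evaluator is independent, total latency in a design like this is bounded by the slowest evaluator rather than the sum of all of them.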
Output
- Inline annotations: Specific comments highlighted in the text with importance ratings
- Summary reports: Overall assessment and key findings
- Grades: Letter grades for different quality dimensions
- Export: XML export for further processing
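To make the output types concrete, here is a minimal sketch of how inline annotations with importance ratings could be represented and serialized to XML. The `Annotation` fields, the element and attribute names, and the importance scale are assumptions for illustration; the actual RoastMyPost export schema is not described in the source.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Annotation:
    evaluator: str   # which evaluator produced the comment
    start: int       # character offset where the highlight begins
    end: int         # character offset where the highlight ends
    importance: int  # hypothetical scale: higher means more important
    comment: str


def export_xml(annotations: list[Annotation]) -> str:
    """Serialize annotations to an XML string, most important first."""
    root = ET.Element("evaluation")  # hypothetical root element name
    for a in sorted(annotations, key=lambda a: a.importance, reverse=True):
        node = ET.SubElement(
            root,
            "annotation",
            evaluator=a.evaluator,
            start=str(a.start),
            end=str(a.end),
            importance=str(a.importance),
        )
        node.text = a.comment
    return ET.tostring(root, encoding="unicode")
```

Sorting by importance before export is one way to help a reader triage comments, which matters given the false positive rate noted elsewhere in this article.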
Ideal Use Cases
Works best with:
- Documents of 200 to 10,000 words
- Content containing factual claims that can be verified
- Research posts and analyses
- Squiggle probabilistic models
Less suitable for:
- Very long documents (performance issues)
- LaTeX-formatted content
- Highly specialized technical content requiring domain expertise
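The guidance above could be turned into a simple pre-submission check. The following sketch is an assumption-laden illustration, not part of RoastMyPost: the function name, the warning strings, and the crude LaTeX heuristic are all hypothetical, and only the 200 and 10,000 word bounds come from the list above.

```python
import re

MIN_WORDS = 200
MAX_WORDS = 10_000
# Crude heuristic (an assumption): a few common LaTeX commands or a $$ delimiter.
LATEX_PATTERN = re.compile(r"\\(begin|frac|sum|int)\b|\$\$")


def suitability_warnings(text: str) -> list[str]:
    """Return human-readable warnings for documents outside the suggested range."""
    warnings = []
    word_count = len(text.split())
    if word_count < MIN_WORDS:
        warnings.append("shorter than 200 words")
    elif word_count > MAX_WORDS:
        warnings.append("longer than 10,000 words")
    if LATEX_PATTERN.search(text):
        warnings.append("contains LaTeX markup")
    return warnings
```

An empty list means the document falls inside the suggested range; it says nothing about whether the content needs domain expertise, which a check like this cannot detect.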
Limitations
The developers explicitly acknowledge significant limitations [1]:
| Limitation | Description |
|---|---|
| False positives | Significant rate of incorrect error flagging |
| Context gaps | Lacks nuanced understanding for some interpretations |
| Fallacy checker | Sometimes flags valid reasoning patterns |
| Complex fact-checking | Struggles with claims requiring multiple research iterations |
| No domain expertise | Cannot replace human expert review in specialized fields |
The platform is experimental and should be used as one input among many rather than a definitive quality assessment.
Development
Ozzie Gooen has committed approximately one-third of his annual work time to maintaining and improving RoastMyPost [1]. The roadmap includes model updates as new Claude versions become available and improvements to evaluator accuracy.
RoastMyPost is currently free for reasonable use, funded through QURI. Usage limits exist to prevent abuse.
Related Tools
| Tool | Purpose | Relationship |
|---|---|---|
| Squiggle | Probabilistic modeling language | RoastMyPost can evaluate Squiggle models |
| SquiggleAI | LLM model generation | Shared LLM integration patterns |
| Elicit | Research assistant | Similar LLM-for-research space |
Sources
Footnotes
1. Announcing RoastMyPost: LLMs eval blog posts and more, EA Forum, December 2025