Research: Adaptive Page Length & Summary Systems
Problem Statement
The wiki has ~625 pages ranging from empty stubs (0 words) to comprehensive analyses (10,000+ words). Different readers want different levels of depth:
- Skimmers want a 2-sentence takeaway
- Decision-makers want a structured 200-word summary with key numbers
- Researchers want the full article with all evidence and nuance
Currently every reader gets the same page. The wiki has a two-tier summary system (description and llmSummary in frontmatter) and infoboxes, but these aren't doing enough to serve short-attention readers.
Current State
Page Length Distribution
| Word Count | Pages | Share | Character |
|---|---|---|---|
| 0–100 | ≈98 | 16% | Stubs, tables, dashboards |
| 101–800 | ≈39 | 6% | Short reference pages |
| 801–2,000 | ≈144 | 23% | Expanded articles |
| 2,001–3,500 | ≈200 | 32% | Modal range — standard articles |
| 3,501–5,000 | ≈121 | 19% | Comprehensive analyses |
| 5,000+ | ≈24 | 4% | Hub pages (bioweapons, org profiles) |
Median: ~2,200 words. Mean: ~2,274 words. The longest page (bioweapons) is 10,646 words.
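The distribution above can be recomputed from a list of per-page word counts. The sketch below is a hypothetical helper, not part of the existing build pipeline; the names `lengthDistribution` and `median` are assumptions, and so is the choice to put a page of exactly 5,000 words in the 3,501–5,000 bucket.

```typescript
// Hypothetical helper: bucket page word counts into the ranges used in the table.
interface Bucket {
  label: string;
  min: number;
  max: number;
}

const BUCKETS: Bucket[] = [
  { label: "0–100", min: 0, max: 100 },
  { label: "101–800", min: 101, max: 800 },
  { label: "801–2,000", min: 801, max: 2000 },
  { label: "2,001–3,500", min: 2001, max: 3500 },
  { label: "3,501–5,000", min: 3501, max: 5000 },
  // Assumption: exactly 5,000 words belongs to the previous bucket.
  { label: "5,000+", min: 5001, max: Infinity },
];

interface BucketRow {
  label: string;
  pages: number;
  share: number; // whole-number percentage of all pages
}

function lengthDistribution(wordCounts: number[]): BucketRow[] {
  const total = wordCounts.length;
  return BUCKETS.map((b) => {
    const pages = wordCounts.filter((w) => w >= b.min && w <= b.max).length;
    const share = total === 0 ? 0 : Math.round((pages / total) * 100);
    return { label: b.label, pages, share };
  });
}

function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```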
What Exists Today
Two-tier summaries:
- description (frontmatter): 1–2 sentences, ~100–200 chars. Used in meta tags and OG cards.
- llmSummary (frontmatter): AI-generated, 200–500 chars. Shown in PageStatus component, EntityLink hover tooltips, and InfoBox fallback.
InfoBox:
- Right-floating card (280px desktop, full-width mobile) showing entity metadata, key facts, related entries, ratings, links.
- Description is CSS-truncated to 3 lines (line-clamp-3) with no "read more" button.
PageStatus banner:
- Shows quality/importance badges, word count, and llmSummary text above the article.
- Currently a static block — no collapse/expand behavior.
Table of Contents:
- Sticky sidebar on desktop, auto-generated from H2/H3 headings.
- Scroll-tracking via IntersectionObserver.
- Only renders if 3+ headings exist.
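As a rough illustration of the Table of Contents rule, the sketch below pulls H2/H3 headings out of raw markdown and applies the "3+ headings" threshold. It is not the wiki's actual implementation: `extractHeadingsSketch` and `shouldRenderToc` are hypothetical names, and the line-based regex is an assumption about how headings are written.

```typescript
interface Heading {
  depth: 2 | 3;
  text: string;
}

// Built at runtime so the fence marker never appears literally in this file.
const FENCE = "`".repeat(3);

// Hypothetical stand-in for a heading extractor; only ATX-style H2/H3 lines
// ("## Title", "### Title") outside fenced code blocks are recognised.
function extractHeadingsSketch(markdown: string): Heading[] {
  const headings: Heading[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith(FENCE)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const match = /^(#{2,3})\s+(.+?)\s*$/.exec(line);
    if (match) {
      headings.push({ depth: match[1].length as 2 | 3, text: match[2] });
    }
  }
  return headings;
}

// The TOC only renders when the page has at least three headings.
function shouldRenderToc(headings: Heading[]): boolean {
  return headings.length >= 3;
}
```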
Key gaps:
- No interactive expand/collapse on article sections.
- No user preference for "show me the short version."
- InfoBox description truncation has no expand affordance.
- Same rendering for 200-word and 10,000-word pages.
Approaches
Approach 1: Better Summaries & InfoBoxes (Low Effort, High Impact)
Don't change the page rendering model at all. Instead, invest in making the top-of-page experience much richer so skimmers never need to scroll.
What to do:
- Structured llmSummary: Instead of a single paragraph, generate summaries with a consistent structure:
  - One-liner (what is this thing, in one sentence)
  - Key numbers (2–4 bullet points with the most important quantitative claims)
  - Bottom line (so-what for a decision-maker)
  - Store this as structured YAML in frontmatter rather than a single string.
- Expand the InfoBox description: Add a "Show full summary" toggle to the InfoBox so the line-clamp-3 truncation is interactive. The llmSummary text is already there, just hidden by CSS.
- Quick Assessment tables: Many KB risk pages already have a "Quick Assessment" table at the top. Standardize this across all page types (not just risks) and enforce it in page templates.
- Key Takeaways component: A new MDX component (<KeyTakeaways>) that authors place after the first heading. Renders as a highlighted box with 3–5 bullet points. Unlike the auto-generated llmSummary, this is hand-curated.
Pros: No architectural changes. Works with static generation. Improves every page visit. Incremental — can improve one page at a time.
Cons: Doesn't actually shorten pages. Skimmers still see a long scrollbar. Every page still ships its full HTML.
Approach 2: Collapsible Sections (Medium Effort)
Keep full SSG but make sections below the overview collapsed by default on long pages.
What to do:
- Auto-wrap H2 sections in a <CollapsibleSection> component during MDX compilation. The Radix UI Collapsible primitive already exists in the codebase (app/src/components/ui/collapsible.tsx).
- Default behavior by page length:
  - < 1,500 words: all sections expanded (short page, no need to collapse)
  - 1,500–4,000 words: Overview expanded, rest collapsed
  - > 4,000 words: Overview expanded, rest collapsed, with an "Expand all" button
- Remember user state: Use localStorage (pattern already established with DevModeToggle and InfoBoxVisibility) to remember which sections the user expanded.
- Section word counts: Extend build-data.mjs to compute per-section word counts and display them as subtle badges on collapsed headers (e.g. "Risk Assessment · 840 words").
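The length-based defaults in this approach reduce to a small pure function. The sketch below is one possible encoding; `CollapsePolicy`, `defaultCollapsePolicy`, and `isSectionOpen` are hypothetical names. It also assumes that Overview is the first H2 section and that a remembered user choice overrides the default.

```typescript
interface CollapsePolicy {
  collapseBelowOverview: boolean;
  showExpandAll: boolean;
}

// Thresholds taken from the list above: < 1,500, 1,500–4,000, > 4,000 words.
function defaultCollapsePolicy(wordCount: number): CollapsePolicy {
  if (wordCount < 1500) {
    return { collapseBelowOverview: false, showExpandAll: false };
  }
  if (wordCount <= 4000) {
    return { collapseBelowOverview: true, showExpandAll: false };
  }
  return { collapseBelowOverview: true, showExpandAll: true };
}

// savedState is whatever was remembered for this section (e.g. read from
// localStorage by the caller); undefined means the user never toggled it.
function isSectionOpen(
  sectionIndex: number,
  policy: CollapsePolicy,
  savedState?: boolean,
): boolean {
  if (savedState !== undefined) return savedState; // remembered choice wins
  if (sectionIndex === 0) return true; // assumption: Overview is section 0
  return !policy.collapseBelowOverview;
}
```

Keeping the policy free of DOM and storage access means it can run both at build time (to pick initial markup) and in the client component.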
Pros: All content still pre-rendered (SEO-friendly). Readers see a manageable page. Existing Radix primitives reduce build effort. Section-level anchors and TOC continue to work.
Cons: All HTML still ships to the client (no payload savings). Requires an MDX plugin or remark plugin to auto-wrap sections. Testing the collapse/expand UX across 625 pages needs care.
Approach 3: View Mode Toggle (Medium-High Effort)
Give readers a "Summary / Standard / Full" toggle that controls how much of the page is shown.
What to do:
- Three view modes:
  - Summary: InfoBox + structured summary + key takeaways only (~200–400 words). No article body.
  - Standard: Overview section + Quick Assessment + collapsed detail sections (~800–1,500 words visible).
  - Full: Everything expanded, as today.
- A <ViewModeProvider> context (following the InfoBoxVisibilityProvider pattern) wrapping the page layout. A toggle in the page header or breadcrumb bar.
- Persist preference in localStorage. Default to "Standard" for pages over 3,000 words, "Full" for shorter pages.
- URL parameter support: ?view=summary so links can point to a specific view. Useful for sharing with non-technical audiences.
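One way to combine the URL parameter, the stored preference, and the length-based default is sketched below. The precedence order (URL first, then localStorage, then the default) is an assumption; the text above does not specify it. `resolveViewMode` and `parseViewMode` are hypothetical names.

```typescript
type ViewMode = "summary" | "standard" | "full";

const VIEW_MODES: readonly ViewMode[] = ["summary", "standard", "full"];

// Returns the mode if the raw value is one of the three known modes.
function parseViewMode(value: string | null | undefined): ViewMode | undefined {
  return VIEW_MODES.find((mode) => mode === value);
}

// search: the query string (e.g. window.location.search)
// stored: the raw value read from localStorage, or null if absent
function resolveViewMode(
  search: string,
  stored: string | null,
  wordCount: number,
): ViewMode {
  const fromUrl = parseViewMode(new URLSearchParams(search).get("view"));
  if (fromUrl) return fromUrl; // a shared link should always win
  const fromStorage = parseViewMode(stored);
  if (fromStorage) return fromStorage;
  // Default from the list above: Standard over 3,000 words, Full otherwise.
  return wordCount > 3000 ? "standard" : "full";
}
```

Unknown values such as `?view=bogus` fall through to the next source rather than throwing, so stale links degrade to the default view.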
Pros: Directly addresses the "different readers want different depths" problem. URL parameters make it shareable. Could eventually be personalized per-user.
Cons: Significant UX design work. Need to ensure "Summary" mode is genuinely useful (not just a worse version of the page). Three modes means three things to test and maintain.
Approach 4: Per-User LLM Rewriting (High Effort, Experimental)
Dynamically rewrite pages using an LLM to match each reader's desired length and expertise level.
What to do:
- On-demand summarization API: A serverless function that takes a page ID, target length (e.g., 200 words), and audience (e.g., "policymaker" / "ML researcher"), and returns a rewritten version using Claude.
- Caching layer: Cache rewrites by (page_id, length, audience) tuple. Pre-generate common variants at build time for high-importance pages.
- UI: A slider or dropdown: "Show me this in: 1 paragraph / 1 page / full article" with a loading indicator while the rewrite generates.
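The caching layer can be sketched as a wrapper that keys on the (page_id, length, audience) tuple. This is an in-memory illustration only: the `Rewriter` type, `createCachedRewriter`, and the audience identifiers are hypothetical, and the actual LLM call is injected rather than shown.

```typescript
type Audience = "policymaker" | "ml-researcher";

interface RewriteRequest {
  pageId: string;
  targetWords: number;
  audience: Audience;
}

// The function that actually calls the LLM; supplied by the caller.
type Rewriter = (req: RewriteRequest) => Promise<string>;

// Serialise the tuple so it can serve as a Map key.
function cacheKey(req: RewriteRequest): string {
  return JSON.stringify([req.pageId, req.targetWords, req.audience]);
}

function createCachedRewriter(rewrite: Rewriter): Rewriter {
  // Promises are cached so concurrent requests for the same key share one call.
  const cache = new Map<string, Promise<string>>();
  return (req) => {
    const key = cacheKey(req);
    let pending = cache.get(key);
    if (!pending) {
      pending = rewrite(req).catch((err) => {
        cache.delete(key); // do not cache failures
        throw err;
      });
      cache.set(key, pending);
    }
    return pending;
  };
}
```

A production version would need a persistent store and invalidation when the source page changes; neither is covered here.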
Pros: Most flexible. Each reader gets exactly what they want. Could support multiple languages, reading levels, or domain expertise.
Cons: Expensive (an LLM call for every uncached page/length/audience combination). Latency on first view. Risk of the LLM introducing errors or losing nuance. Breaks the "wiki as source of truth" model: which version is canonical? Requires moving away from pure SSG. Hard to quality-control generated summaries at scale.
Approach 5: Tiered Content Authoring (High Effort, High Quality)
Author pages at multiple depths deliberately, as part of the content pipeline.
What to do:
- Extend page templates to require three tiers of content:
  - Tier 1 (Card, ~50 words): Title + one-line description + key metric. Already roughly exists as the description field.
  - Tier 2 (Brief, ~300 words): Structured summary with key numbers, main arguments, and bottom line. New frontmatter field or dedicated MDX section.
  - Tier 3 (Full): The existing article.
- Authoring tooling: Extend crux content improve to generate Tier 2 briefs from Tier 3 content. Run this across all pages to bootstrap.
- Render by context: Card view in search results and related-pages lists. Brief view as default for pages over 3,000 words (with "Read full article" link). Full view on click-through.
- Quality control: Grade briefs separately from full articles. A bad brief on a good article is worse than no brief.
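The "render by context" rule can be written as a single selection function. The sketch below is illustrative; the context names and `selectTier` are hypothetical, and falling back to the full article when no brief exists is an assumption layered on top of the rule above.

```typescript
type Tier = "card" | "brief" | "full";

// Hypothetical rendering contexts; "page-click-through" means the reader
// followed the "Read full article" link from a brief.
type RenderContext =
  | "search-result"
  | "related-list"
  | "page"
  | "page-click-through";

function selectTier(
  context: RenderContext,
  wordCount: number,
  hasBrief: boolean,
): Tier {
  if (context === "search-result" || context === "related-list") {
    return "card";
  }
  // Brief is the default only for long pages that actually have one.
  if (context === "page" && wordCount > 3000 && hasBrief) {
    return "brief";
  }
  return "full";
}
```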
Pros: Highest quality — human-reviewed summaries at each tier. Each tier is a first-class piece of content. Cards improve search/browse experience even without changing page rendering.
Cons: Roughly doubles authoring work. Tiers must be kept in sync when content changes. 625 pages is a lot of briefs to write, even with LLM assistance.
Recommendations
The approaches aren't mutually exclusive. Here's a phased plan:
Phase 1 — Improve what exists (Approach 1)
This is the highest-leverage, lowest-risk starting point:
- Restructure llmSummary into a structured format (one-liner, key numbers, bottom line). Run the crux authoring pipeline to regenerate summaries for all pages.
- Add an expand toggle to the InfoBox description (currently just CSS line-clamp-3).
- Standardize Quick Assessment tables across all page types (risks already have them).
- Create a <KeyTakeaways> MDX component for hand-curated bullet summaries.
Phase 2 — Collapsible sections (Approach 2)
Once summaries are solid, make long pages less intimidating:
- Build a remark plugin that wraps H2 sections in <CollapsibleSection>.
- Default-collapse sections below Overview on pages over ~2,000 words.
- Add section word counts to collapsed headers.
- Store expansion state in localStorage.
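Section word counts can be prototyped on raw markdown before touching the build pipeline. The sketch below splits on H2 lines and counts whitespace-separated tokens; `sectionWordCounts` is a hypothetical name, and the approach ignores MDX components, tables, and code fences, so real counts would differ somewhat.

```typescript
interface SectionCount {
  heading: string;
  words: number;
}

function countWords(text: string): number {
  const tokens = text.trim().split(/\s+/);
  return tokens[0] === "" ? 0 : tokens.length;
}

// Text before the first H2 is not attributed to any section.
function sectionWordCounts(markdown: string): SectionCount[] {
  const sections: SectionCount[] = [];
  let current: { heading: string; lines: string[] } | null = null;

  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.+?)\s*$/.exec(line);
    if (match) {
      if (current) {
        sections.push({
          heading: current.heading,
          words: countWords(current.lines.join(" ")),
        });
      }
      current = { heading: match[1], lines: [] };
    } else if (current) {
      // Strip heading markers so "### Detail" counts as one word, not two.
      current.lines.push(line.replace(/^#+\s*/, ""));
    }
  }
  if (current) {
    sections.push({
      heading: current.heading,
      words: countWords(current.lines.join(" ")),
    });
  }
  return sections;
}
```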
Phase 3 — View mode toggle (Approach 3, partial)
Add a Summary/Full toggle (skip the "Standard" middle mode to reduce complexity):
- Summary mode shows structured summary + InfoBox + Key Takeaways only.
- Full mode shows the complete page (with collapsible sections from Phase 2).
- Persist preference. Support ?view=summary URL parameter.
Phase 4 (optional) — Tiered authoring (Approach 5)
If Phases 1–3 show strong user demand for short content:
- Extend page templates to include a Tier 2 brief section.
- Use crux content improve to auto-generate briefs, then human-review.
- Render briefs as the default view for high-importance, high-length pages.
Skip for now: Approach 4 (per-user LLM rewriting). The cost, latency, and quality-control challenges outweigh the benefits given the wiki's static architecture. If the wiki moves to a server-rendered model for other reasons, revisit.
Technical Feasibility Notes
Existing infrastructure that helps:
- Radix UI Collapsible and Tabs components are already in the codebase
- InfoBoxVisibilityProvider context establishes the toggle-with-localStorage pattern
- DevModeToggle proves localStorage preferences work across sessions
- extractHeadings() in mdx.ts already parses H2/H3 structure at compile time
- build-data.mjs already computes page-level metrics; extending to section-level is straightforward
- crux content improve can be extended to generate structured summaries
What would need to be built:
- Phase 1: Structured summary YAML schema, <KeyTakeaways> component, InfoBox expand toggle (~200 lines)
- Phase 2: Remark plugin for section wrapping, <CollapsibleSection> component, section metadata in build pipeline (~500 lines)
- Phase 3: <ViewModeProvider> context, header toggle, URL param handling (~300 lines)
Phase 1 Implementation Status
Phase 1 has been implemented. Here's what was built:
1. Structured Summary Schema
New structuredSummary frontmatter field with three sub-fields:
    structuredSummary:
      oneLiner: "One sentence explaining what this thing is"
      keyPoints:
        - "First key quantitative or qualitative finding"
        - "Second key point"
        - "Third key point"
      bottomLine: "So-what for a decision-maker"
Files changed:
- crux/lib/rules/frontmatter-schema.ts: Zod validation schema
- app/src/data/index.ts: TypeScript Page interface
- app/scripts/build-data.mjs: Build pipeline passthrough
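For readers without the repository at hand, the shape of the structuredSummary field can be expressed as a TypeScript interface with a runtime check. The real validation uses a Zod schema; the dependency-free type guard below is only a sketch, and its requirements (non-empty strings, at least one key point) are assumptions rather than the schema's actual rules.

```typescript
interface StructuredSummary {
  oneLiner: string;
  keyPoints: string[];
  bottomLine: string;
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Assumed constraints: all three fields present, strings non-empty,
// keyPoints is a non-empty array of non-empty strings.
function isStructuredSummary(value: unknown): value is StructuredSummary {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const keyPoints = candidate["keyPoints"];
  return (
    isNonEmptyString(candidate["oneLiner"]) &&
    Array.isArray(keyPoints) &&
    keyPoints.length > 0 &&
    keyPoints.every((point) => isNonEmptyString(point)) &&
    isNonEmptyString(candidate["bottomLine"])
  );
}
```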
2. InfoBox Expand/Collapse Toggle
The InfoBox description was previously CSS-truncated to 3 lines (line-clamp-3) with no way to see the full text. Now it's interactive: if the text overflows 3 lines, a "Show more" / "Show less" button appears.
New file: app/src/components/wiki/InfoBoxDescription.tsx — Client component using useRef + useEffect to detect overflow, toggling line-clamp-3 on click.
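The overflow check behind the toggle usually compares an element's scrollHeight with its clientHeight while the clamp is applied. The sketch below isolates that logic from React so it can be read on its own; it is not the contents of InfoBoxDescription.tsx, and `isClamped` and `toggleState` are hypothetical names.

```typescript
// Minimal subset of HTMLElement needed for the measurement.
interface Measurable {
  scrollHeight: number;
  clientHeight: number;
}

// Must be measured while line-clamp-3 is applied: once the text is expanded,
// scrollHeight and clientHeight are equal and the check reports no overflow.
function isClamped(el: Measurable): boolean {
  return el.scrollHeight > el.clientHeight;
}

interface ToggleState {
  showButton: boolean;
  label: "Show more" | "Show less" | null;
  clampClass: "line-clamp-3" | "";
}

function toggleState(overflowsWhenClamped: boolean, expanded: boolean): ToggleState {
  if (!overflowsWhenClamped) {
    // Short text: keep the clamp (it has no visible effect) and hide the button.
    return { showButton: false, label: null, clampClass: "line-clamp-3" };
  }
  return expanded
    ? { showButton: true, label: "Show less", clampClass: "" }
    : { showButton: true, label: "Show more", clampClass: "line-clamp-3" };
}
```

In a component, the measurement would typically run inside an effect on a ref'd element, with the result stored in state so it survives the switch to the expanded view.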
3. KeyTakeaways MDX Component
A new <KeyTakeaways> component for hand-curated bullet summaries at the top of articles. Renders as a visually distinct card with an indigo accent (lightbulb icon + "Key Takeaways" header).
New file: app/src/components/wiki/KeyTakeaways.tsx
Registered in: app/src/components/mdx-components.tsx
4. Structured Summary Rendering in PageStatus
When a structuredSummary is available, PageStatus renders it as:
- One-liner in bold at the top
- Key points as a bullet list
- Bottom line in a highlighted indigo box with a "BOTTOM LINE" label
Falls back to the existing flat llmSummary paragraph when no structured summary exists.
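The fallback behaviour amounts to picking one of three render models. The sketch below shows that selection as a discriminated union; `selectSummaryView` and the `SummaryView` type are hypothetical, and the "none" case for pages with neither field is an assumption.

```typescript
interface StructuredSummary {
  oneLiner: string;
  keyPoints: string[];
  bottomLine: string;
}

interface PageSummaryFields {
  structuredSummary?: StructuredSummary;
  llmSummary?: string;
}

type SummaryView =
  | { kind: "structured"; oneLiner: string; keyPoints: string[]; bottomLine: string }
  | { kind: "flat"; text: string }
  | { kind: "none" };

function selectSummaryView(page: PageSummaryFields): SummaryView {
  if (page.structuredSummary) {
    // Prefer the structured form whenever it exists.
    return { kind: "structured", ...page.structuredSummary };
  }
  if (page.llmSummary) {
    return { kind: "flat", text: page.llmSummary };
  }
  return { kind: "none" };
}
```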
5. Example Pages
Structured summaries and KeyTakeaways added to:
- content/docs/knowledge-base/risks/scheming.mdx: Risk page with both structuredSummary and <KeyTakeaways>
- content/docs/knowledge-base/responses/interpretability.mdx: Response page with both
Next Steps for Phase 1 Rollout
- Run crux content improve across all pages to auto-generate structuredSummary fields
- Hand-review generated summaries for high-importance pages (importance >= 70)
- Add <KeyTakeaways> to the top 30 most-visited pages
- Consider making structuredSummary a required field in page templates for new pages