Research & Discovery
What this is
Research and discovery is how we find new information for the wiki — searching the web, crawling RSS feeds, fetching pages from external APIs, and assembling that material into context bundles for page authoring or improvement. It is the first stage of the data lifecycle, before verification and storage.
When to start here
If you're trying to answer a question that needs current sources, draft a new page, or react to news, this is your starting point.
Live dashboards
| Dashboard | What it shows |
|---|---|
| Auto-Update News | News items pulled from RSS feeds, grouped by source |
| Auto-Update Runs | Pipeline run history — what was fetched, routed, and applied |
CLI playbook
The most useful commands, in rough order of frequency:
```bash
# Multi-source web search (Exa + Perplexity + SCRY + GitHub)
pnpm crux query search "compute governance"
pnpm crux query search "MIRI funding" --limit=5

# Assemble a research bundle for a specific page or topic
pnpm crux context for-page anthropic
pnpm crux context for-topic "RLHF alignment tax" --limit=15
pnpm crux context for-entity openai

# Auto-update news pipeline (RSS + web search → routing → page updates)
pnpm crux w auto-update digest            # Fetch sources, show digest
pnpm crux w auto-update plan              # Preview what would be updated
pnpm crux w auto-update run --budget=30   # Run with $30 budget cap
pnpm crux w auto-update sources           # List configured sources
pnpm crux w auto-update sources --check   # Test source URL reachability
pnpm crux w auto-update history           # Show recent run history
```
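The auto-update commands above are often run in sequence: check the sources, preview the plan, then run with a budget cap. A minimal sketch of that sequence as a shell function follows. The names `crux_step`, `guarded_auto_update`, and `CRUX_DRY_RUN` are hypothetical and not part of crux; only the `pnpm crux w auto-update ...` subcommands come from this page. It also assumes each subcommand exits non-zero on failure, which this page does not confirm.

```bash
# Hypothetical wrapper; not part of the crux CLI.
# Only the `pnpm crux w auto-update ...` subcommands are taken from the playbook.

crux_step() {
  # In dry-run mode, print the command instead of executing it.
  if [ "${CRUX_DRY_RUN:-0}" = "1" ]; then
    echo "pnpm crux w auto-update $*"
  else
    pnpm crux w auto-update "$@"
  fi
}

guarded_auto_update() {
  # First argument: dollar budget cap passed to --budget (defaults to 30).
  local budget="${1:-30}"

  # Each step runs only if the previous one succeeded.
  # Assumption: the subcommands signal failure with a non-zero exit code.
  crux_step sources --check &&
    crux_step plan &&
    crux_step run "--budget=${budget}"
}
```

Running `CRUX_DRY_RUN=1 guarded_auto_update 10` prints the three commands without executing them, which is a cheap way to confirm the sequence before spending any budget.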
When to use what
| I want to... | Run | Or check |
|---|---|---|
| Find pages/sources about a topic | pnpm crux query search "topic" | — |
| Gather context to draft a new page | pnpm crux context for-topic "topic" | Output in .claude/wip-context.md |
| See if there's recent news on something | pnpm crux w auto-update digest | Auto-Update News dashboard |
| Trigger a news-driven update of pages | pnpm crux w auto-update run --budget=30 | Auto-Update Runs dashboard |
| Add a new RSS / news source | Edit data/auto-update/sources.yaml, then pnpm crux w auto-update sources --check | — |
| Compare external search APIs (cost, quality) | — | External Search APIs |
Architecture & deep dives
- External Search APIs — Exa, Perplexity, SCRY, Firecrawl, GitHub, Semantic Scholar — pricing, integration points, what works and what doesn't
- crux/lib/search/ — research agent, providers, deduplication
- crux/auto-update/ — news pipeline orchestration (fetch → digest → route → update)
- data/auto-update/sources.yaml — RSS feed and web-search source configuration
- .github/workflows/auto-update.yml — daily 06:00 UTC scheduled run
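Because search results come from several providers, the same URL can appear more than once, which is why deduplication is listed among the search components. The sketch below illustrates the general idea only; it is not the code in crux/lib/search/. The function name `dedupe_by_url` and the tab-separated `url<TAB>provider` input format are assumptions made for the example.

```bash
# Hypothetical helper illustrating URL-based deduplication.
# Input:  one "url<TAB>provider" record per line on stdin.
# Output: the first record seen for each normalized URL, in input order.
dedupe_by_url() {
  awk -F'\t' '{
    key = tolower($1)        # case-insensitive comparison (a simplification: URL paths can be case-sensitive)
    sub(/\/+$/, "", key)     # treat "https://example.org/a/" and "https://example.org/a" as the same
    if (!(key in seen)) {
      seen[key] = 1
      print                  # emit the original, unnormalized record
    }
  }'
}
```

For example, `printf 'https://example.org/a/\texa\nhttps://example.org/a\tperplexity\n' | dedupe_by_url` keeps only the first record. Real deduplication would probably also need to handle tracking parameters and redirects, which this sketch ignores.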