Model diversity matters more than model quality.
Three different models debating beats three instances of the best model. The adversarial pressure is the feature. The moderator finds where they agree, where they disagree, and why.
Validated at ICML 2024 (Best Paper), NeurIPS 2024, and ICLR 2025
Your AI is a yes-man
Presets or build your own
Start with a curated council of models and roles — or pick exactly which models debate and what perspective each one takes.
Backed by peer-reviewed science
Multi-model debate isn't a hypothesis. It's the mechanism behind the most accurate AI reasoning ever measured.
Non-expert judges improved from 48% → 76% accuracy when evaluating debated answers vs single-model responses
Khan et al. · UCL + Anthropic · ICML 2024 Best Paper
Multi-agent debate improved math reasoning from 67% → 81.8%. Models correct each other through sequential challenge rounds
Du et al. · MIT + DeepMind · ICML 2024
Mixture-of-Agents: open-source models collaborating scored 65.1% vs GPT-4 Omni's 57.5% — proving collective reasoning beats individual capability
Wang et al. · Together AI + Stanford · ICLR 2025
Weak LLM judges supervising strong LLMs via debate outperformed direct questioning on every task tested — scalable oversight works
Kenton et al. · Google DeepMind · NeurIPS 2024
“Two sets of findings released in 2024 offer the first empirical evidence that debate between two LLMs helps a judge recognize the truth.”
— Quanta Magazine, March 2025
Read the research
The peer-reviewed papers behind multi-model deliberation — from UCL, Anthropic, Google DeepMind, MIT, and leading AI labs.
Built for High-Stakes Decisions
Roundtable is designed for confidential, critical work — with full traceability, data privacy, and IDE integration.
Full Traceability
Every tool call logged with model attribution and reasoning chain. When the council says "refactor," you can trace which model proposed it, which challenged it, and why the verdict stands.
Your Data Stays Private
API calls are excluded from model training by every provider we route through, and all traffic is encrypted in transit over HTTPS on Cloudflare's global network.
Human-in-the-Loop
AI deliberates. You decide. Every verdict includes the reasoning so you can override with confidence. The council argues the tradeoffs — you make the call.
Works in Your IDE
Run council deliberations directly in Claude Code, Cursor, Windsurf, or any MCP-compatible IDE. No context switching — debate where you build.
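Registering an MCP server with your client is a one-time config entry. A minimal sketch, assuming a hypothetical `roundtable-mcp` package name and `ROUNDTABLE_API_KEY` variable (the actual command, package, and env vars come from your install docs):

```json
{
  "mcpServers": {
    "roundtable": {
      "command": "npx",
      "args": ["-y", "roundtable-mcp"],
      "env": { "ROUNDTABLE_API_KEY": "your-key-here" }
    }
  }
}
```

Each MCP-compatible client reads an equivalent JSON file from its own settings location (for example, a project-level `.mcp.json` in Claude Code); once registered, council tools appear alongside the client's built-in tools.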
Built for Every High-Stakes Decision
Frequently asked questions
Your AI Council Is Ready
Stop asking one model and hoping it's right. Assemble a council, start the debate.
