Canada 6-0 Qatar: The AI Panel Nailed the Winner and Whiffed Every Scoreline
Eleven frontier AI models unanimously backed Canada against Qatar. They were right on the winner and wrong on the magnitude: Canada won 6-0 while every scoreline guess landed at 2-0, 1-0 or 2-1.
Canada 6-0 Qatar is the kind of result that flatters and exposes an AI panel at the same time. Every one of the eleven models we polled before kickoff picked Canada to win, and every one of them was right. But not a single model came closer than four goals to the actual scoreline. The panel saw the favourite. It badly underestimated the favourite.
A rare unanimous board
Most matches split the room. This one did not. All 11 models in our predictions pool landed on the same side: Canada. That makes it an 11-of-11 consensus, a clean sweep with no dissent and no contrarian flag planted on Qatar.
The confidence numbers tell the more interesting story. This was not a board full of swagger. Ten of the eleven models clustered between 74 and 78: Claude Opus 4.7 sat lowest at 74, with Gemini 2.5 Flash-Lite, Gemini 2.5 Pro, GPT-5 Mini, DeepSeek V3 and Gemini 2.5 Flash all parked at 75. Claude Haiku 4.5 and Claude Sonnet 4.6 nudged to 76, Grok 4 Fast and Gemini 3.1 Pro to 78. The clear outlier was GPT-4o Mini, which alone pushed its conviction to 85.
That spread matters. Unanimity on the side does not mean unanimity on the strength of the read, and a board that agrees on direction while hedging on confidence is a board that suspects a comfortable win without daring to call a rout.
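For readers who want to check the spread themselves, here is a minimal sketch that summarises the confidence figures quoted above. The dictionary and variable names are illustrative and not part of any published tooling. Only the eleven numbers come from the board.

```python
from statistics import mean, median

# Pre-match confidence figures exactly as quoted in this article.
# The dict layout and names are illustrative assumptions.
confidence = {
    "Claude Opus 4.7": 74,
    "Gemini 2.5 Flash-Lite": 75,
    "Gemini 2.5 Pro": 75,
    "GPT-5 Mini": 75,
    "DeepSeek V3": 75,
    "Gemini 2.5 Flash": 75,
    "Claude Haiku 4.5": 76,
    "Claude Sonnet 4.6": 76,
    "Grok 4 Fast": 78,
    "Gemini 3.1 Pro": 78,
    "GPT-4o Mini": 85,
}

values = sorted(confidence.values())

# Basic summary of the board's conviction.
summary = {
    "models": len(values),
    "min": values[0],
    "max": values[-1],
    "median": median(values),
    "mean": round(mean(values), 1),
}

# The same mean with the single highest reading (GPT-4o Mini) left out,
# to show how much one outlier moves the average.
mean_without_top = round(mean(values[:-1]), 1)

print(summary)
print("mean without highest reading:", mean_without_top)
```

On these figures the median sits at 75, and dropping GPT-4o Mini's 85 pulls the mean from 76.5 to 75.7. In other words, almost all of the board's apparent extra conviction comes from a single model.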
What actually happened
Canada won 6-0. Not a narrow grind, not a 2-0 game-management job, but a six-goal demolition with a clean sheet. The head-to-head call was correct across the board: a perfect 11 of 11 on the result.
So on the only binary that the leaderboard rewards for this market, the panel was flawless. There is no asterisk on the winner. The asterisk is everywhere else.
Who got it right, who got it wrong
On the winner, nobody got it wrong. Gemini 2.5 Flash-Lite, Claude Haiku 4.5, Grok 4 Fast, Gemini 2.5 Pro, Gemini 3.1 Pro, GPT-5 Mini, Claude Sonnet 4.6, DeepSeek V3, GPT-4o Mini, Gemini 2.5 Flash and Claude Opus 4.7 all banked the correct call. Every Gemini variant, every Claude tier, both GPT entries, Grok and DeepSeek — the entire frontier agreed and the entire frontier was vindicated.
If there is a quiet winner here, it is GPT-4o Mini, the one model that refused to hedge. At 85 confidence it was the boldest voice in the room, and a 6-0 result rewards boldness: of everyone who picked Canada, it was the least apologetic about it. The lowest-confidence read belonged to Claude Opus 4.7 at 74 — correct, but the most cautious framing of a match that turned into the least cautious result of the round.
The honest caveat: when a board is unanimous, “who was sharp” collapses into “who was confident.” There is no contrarian to credit, no blind model to call out on the winner. The real separation only shows up once you ask the harder question — not who wins, but by how much.
The correct-score angle: a collective miss
This is where the panel comes undone. Every model submitted a correct-score guess, and every model scored zero points on it. Nine of the eleven (Grok 4 Fast, Gemini 2.5 Pro, Claude Sonnet 4.6, GPT-4o Mini, Claude Haiku 4.5, Claude Opus 4.7, Gemini 3.1 Pro, Gemini 2.5 Flash-Lite and Gemini 2.5 Flash) landed on the exact same scoreline: 2-0 Canada. GPT-5 Mini went lower with 1-0. DeepSeek V3 was the only model to predict a Qatar goal at all, with 2-1.
Not one of those guesses survived contact with a 6-0 result. The clustering is the tell: a board that converges on 2-0 is a board that has correctly identified the favourite and then anchored hard on a “routine professional win” template. The models priced in dominance but capped it at two goals. They had no mechanism for a blowout, and the match delivered exactly that.
DeepSeek V3 deserves a footnote for being the lone dissenter on the clean sheet, predicting Qatar would score. It was wrong — Canada kept the shutout — so even the contrarian read missed, just in a different direction from everyone else.
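To put a number on the miss, the sketch below measures how far each guess landed from 6-0. The distance used here, the sum of the absolute goal differences for each side, is our own illustrative yardstick and not the leaderboard's scoring rule. The names `predicted` and `goal_error` are likewise hypothetical.

```python
from collections import Counter

# Final score as (Canada goals, Qatar goals).
ACTUAL = (6, 0)

# Correct-score guesses as reported in this article.
predicted = {
    "Grok 4 Fast": (2, 0),
    "Gemini 2.5 Pro": (2, 0),
    "Claude Sonnet 4.6": (2, 0),
    "GPT-4o Mini": (2, 0),
    "Claude Haiku 4.5": (2, 0),
    "Claude Opus 4.7": (2, 0),
    "Gemini 3.1 Pro": (2, 0),
    "Gemini 2.5 Flash-Lite": (2, 0),
    "Gemini 2.5 Flash": (2, 0),
    "GPT-5 Mini": (1, 0),
    "DeepSeek V3": (2, 1),
}


def goal_error(guess, actual=ACTUAL):
    """Illustrative distance: goals off for Canada plus goals off for Qatar."""
    return abs(guess[0] - actual[0]) + abs(guess[1] - actual[1])


errors = {model: goal_error(score) for model, score in predicted.items()}

# How many models chose each scoreline, and the best error anyone managed.
score_counts = Counter(predicted.values())
closest = min(errors.values())
exact_hits = sum(1 for e in errors.values() if e == 0)

for score, count in score_counts.most_common():
    print(f"{score[0]}-{score[1]}: {count} model(s), {goal_error(score)} goals off")
```

By this measure the nine 2-0 guesses were four goals off, while the 1-0 and 2-1 guesses were each five goals off. The two models that strayed from the consensus scoreline ended up further from the result, not closer.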
Winner: consensus vs result
| Market | AI consensus | Actual result | Verdict |
|---|---|---|---|
| Match winner | Canada (11 of 11) | Canada won 6-0 | ✓ |
| Correct score (most common) | 2-0 Canada (9 of 11) | 6-0 Canada | ✗ |
| Clean sheet | Implied by 10 of 11 (only DeepSeek V3 tipped 2-1) | Qatar kept off the board | ✓ |
The broader pattern
Canada vs Qatar is a clean illustration of where frontier models are strong and where they are systematically soft. Picking the winner of a lopsided fixture is the easy half, and the AIs do it well — 11 of 11 here, no hindsight edits required. Predicting how lopsided is the hard half, and the models keep flinching from the extremes. Faced with a heavy favourite, they reach for 2-0 almost reflexively, because 2-0 is the safe centre of the distribution and a blowout lives in the tail they are reluctant to commit to.
That reluctance is the recurring signature across our board: confident on direction, conservative on magnitude, allergic to the rout. Canada gave them every reason to be bold and they still defaulted to two goals. The winner column says the panel called it. The scoreline column says they only called half of it.
See the full model-by-model breakdown on the Canada vs Qatar match page, and track which models are sharpest over the tournament on the leaderboard.