We Asked 10 Frontier AI Models Who Wins Canada vs Bosnia at the World Cup. They Agreed on Almost Everything.
Ten frontier AI models, including Claude Opus 4.7, GPT-5 Mini, Grok 4 Fast and Gemini 2.5 Pro, were given the same brief on Canada vs Bosnia at the 2026 FIFA World Cup. Nine of ten picked Canada. Every single one picked Under 2.5 goals. Here is the full per-market table and what the consensus is signalling.
Same brief. Same scoreboard. Public results — no cherry-picking.
TL;DR
Ten frontier AI models — Anthropic's Claude Opus 4.7, Opus 4.6, Sonnet 4.6 and Haiku 4.5, OpenAI's GPT-5 Mini and GPT-4o Mini, xAI's Grok 4 Fast, Google's Gemini 2.5 Pro and Flash-Lite, and DeepSeek V3 — were given the same minimal brief on Canada vs Bosnia & Herzegovina (FIFA World Cup, Friday June 12, 19:00 UTC) and asked to call six markets.
Nine of ten picked Canada to win. Every single model picked Under 2.5 goals and Both Teams to Score: No. The bookmaker has Canada at 1.85 (~54% implied) — the AI consensus runs 48-58%, almost no edge on the moneyline. The real signal is in the totals: every model sees a tense, low-scoring game.
The only contrarian on the moneyline is Google's Gemini 2.5 Pro, which picks a draw and Bosnia to cover the spread.
The match-up
Canada vs Bosnia & Herzegovina, FIFA World Cup group stage, Friday June 12, 19:00 UTC.
Bookmaker consensus at the time the AIs were called:
| Outcome | Decimal odds | Implied probability |
|---|---|---|
| Canada win | 1.85 | 54.0% |
| Draw | 3.65 | 27.4% |
| Bosnia win | 4.20 | 23.8% |
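The implied probabilities in the table are the reciprocals of the decimal odds. A minimal sketch of that arithmetic follows; the proportional scaling used to strip the bookmaker's margin is an assumption for illustration (it is one common method, not something the article specifies), and the variable names are hypothetical.

```python
# Decimal odds copied from the table above.
odds = {"Canada win": 1.85, "Draw": 3.65, "Bosnia win": 4.20}

# Raw implied probability is the reciprocal of the decimal price.
implied = {outcome: 1.0 / price for outcome, price in odds.items()}

# The raw figures sum to more than 1; the excess is the bookmaker's margin.
total = sum(implied.values())
overround = total - 1.0

# Assumption: remove the margin by proportional scaling. Other methods exist.
fair = {outcome: p / total for outcome, p in implied.items()}

for outcome in odds:
    print(f"{outcome}: raw {implied[outcome]:.1%}, margin-free {fair[outcome]:.1%}")
print(f"Overround: {overround:.1%}")
```

On these prices the three raw probabilities add up to a little over 105%, so under this scaling the margin-free Canada figure sits closer to 51% than to the quoted 54%.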
Canada are favourites at home — but not heavy ones. A draw is realistic, an upset is not unthinkable. This is the kind of fixture where ten frontier models, working from the same minimal brief, should be free to disagree.
They mostly didn't.
What every AI picked
| Model | Match Winner | Over/Under 2.5 | BTTS | Spread (−1) | Asian Handicap | HT/FT |
|---|---|---|---|---|---|---|
| Claude Opus 4.7 (Flagship, Anthropic) | Canada (50%) | Under (58%) | No (54%) | Bosnia +1 (62%) | Canada −0.5 (50%) | Draw/Canada (22%) |
| Claude Opus 4.6 (Flagship, Anthropic) | Canada (48%) | Under (62%) | No (54%) | Bosnia +1 (58%) | Bosnia +0.5 (52%) | Draw/Canada (26%) |
| Claude Sonnet 4.6 (Anthropic) | Canada (55%) | Under (60%) | No (58%) | Canada −1 (50%) | Canada −0.5 (55%) | Canada/Canada (35%) |
| Claude Haiku 4.5 (Anthropic) | Canada (55%) | Under (60%) | No (58%) | Canada −1 (45%) | Canada −0.5 (55%) | Canada/Canada (30%) |
| GPT-5 Mini (OpenAI) | Canada (58%) | Under (68%) | No (58%) | Bosnia (72%) | Canada −0.5 (60%) | Draw/Canada (40%) |
| GPT-4o Mini (OpenAI) | Canada (55%) | Under (60%) | No (65%) | Canada (55%) | Canada −0.5 (55%) | Home/Home (55%) |
| Grok 4 Fast (xAI) | Canada (55%) | Under (52%) | No (53%) | Canada −1 (51%) | Canada −0.5 (54%) | Canada/Canada (38%) |
| Gemini 2.5 Pro (Flagship) | Draw (40%) | Under (65%) | No (58%) | Bosnia (75%) | — | — |
| Gemini 2.5 Flash-Lite | Canada (57%) | Under (60%) | No (55%) | Canada (53%) | Canada −0.5 (58%) | Draw/Draw (30%) |
| DeepSeek V3 (DeepSeek) | Canada (55%) | Under (58%) | No (55%) | Canada −1 (30%) | Canada −0.5 (55%) | Canada/Canada (35%) |
Green cells mark the arena consensus on that market. Red cells mark the contrarian. Percentages show each model's stated confidence in its own pick.
The one thing every AI agrees on: this game stays under 2.5 goals
It is unusual for ten independently prompted models to converge on the same direction in two correlated markets:
- Under 2.5 — 10/10 models, confidence range 52% to 68%
- BTTS No — 10/10 models, confidence range 53% to 65%
The mean AI confidence for Under 2.5 is ~60%. If the bookmaker's totals price implies roughly 54% for Under 2.5, the AI consensus sits about 6 points above the market. That is the only market where the arena's collective signal is loud.
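The averages behind that claim can be reproduced from the picks table. The sketch below treats each model's stated confidence as a probability, which is an assumption, and uses the article's conditional 54% bookmaker figure for Under 2.5 rather than a quoted price; the variable names are illustrative.

```python
from statistics import mean

# Stated confidence (%) per model, in the order the picks table lists them.
under_conf = [58, 62, 60, 60, 68, 60, 52, 65, 60, 58]
btts_no_conf = [54, 54, 58, 58, 58, 65, 53, 58, 55, 55]

# Assumption: the article's conditional ~54% bookmaker figure for Under 2.5.
BOOK_UNDER_PCT = 54.0

mean_under = mean(under_conf)
mean_btts_no = mean(btts_no_conf)

# Gap between the arena's mean confidence and the assumed market figure.
edge_points = mean_under - BOOK_UNDER_PCT

print(f"Under 2.5: mean {mean_under:.1f}%, range {min(under_conf)}-{max(under_conf)}%")
print(f"BTTS No:   mean {mean_btts_no:.1f}%, range {min(btts_no_conf)}-{max(btts_no_conf)}%")
print(f"Gap vs assumed book figure: {edge_points:.1f} points")
```

The gap only means something if the 54% figure matches the real totals price, so it is worth recomputing against the actual Under 2.5 odds before reading it as an edge.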
This kind of cross-market agreement is what makes AI consensus useful as a signal, not just a prediction. When models disagree, it usually means the data is genuinely ambiguous. When they unanimously stack a low-scoring read across two different markets, it usually means they're seeing the same defensive matchup pattern — and the book hasn't fully priced it.
The contrarian: Gemini 2.5 Pro picks the draw
Nine models picked Canada straight up. Google's Gemini 2.5 Pro picked Draw at 40% confidence — and went further, putting 75% confidence on Bosnia covering the spread.
Two interesting things about that:
- Gemini 2.5 Pro is the only flagship model that dissented; the two flagship Claude Opus models stayed with Canada. When one of the most expensive models in the arena breaks from an otherwise unanimous field, the dissent is worth reading.
- Gemini's pick sits in the tail of the bookmaker's pricing: the draw is the roughly 27% outcome, and the one recreational bettors back least. If Gemini is right, both the draw price and the Bosnia +1 line look like value.
We're not making a recommendation. We're flagging that the most powerful Gemini model in the arena disagrees with the consensus, and that disagreement targets the bookmaker's most under-bet outcome.
Where they break down: the spread
The moneyline is a Canada landslide. The totals are unanimous. The spread is where the arena splits.
Three different reads on the same −1 line:
- Canada −1 / Canada covers — Sonnet 4.6, Haiku 4.5, GPT-4o Mini, Grok 4 Fast, Flash-Lite, DeepSeek V3 (six models)
- Bosnia +1 / Canada doesn't cover — Opus 4.7, Opus 4.6, GPT-5 Mini (three models)
- Bosnia outright covers — Gemini 2.5 Pro (one model)
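The pattern is informative. The three models that picked Canada to win but backed Bosnia +1 (Opus 4.7, Opus 4.6 and GPT-5 Mini) are implying a one-goal Canada win, exactly the result that doesn't cover a −1 line. That maps cleanly onto the Under 2.5 unanimity: if the game ends 1-0 Canada (the modal AI prediction), the moneyline pays out, Under 2.5 and BTTS No pay out, and the spread doesn't. A 2-1 Canada win would also miss the −1 line, but it would land Over 2.5 with both teams scoring, against every model's totals pick.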
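How a single scoreline settles across these markets can be checked with a small sketch. The `settle` function and its field names are hypothetical, and "covers" is assumed to mean winning by more than the one-goal line, so a one-goal win does not count as a cover.

```python
def settle(canada_goals: int, bosnia_goals: int) -> dict:
    """Settle four of the article's markets for one final score."""
    margin = canada_goals - bosnia_goals
    total = canada_goals + bosnia_goals
    if margin > 0:
        winner = "Canada"
    elif margin < 0:
        winner = "Bosnia"
    else:
        winner = "Draw"
    return {
        "match_winner": winner,
        "under_2_5": total < 2.5,
        # BTTS No wins when at least one side fails to score.
        "btts_no": canada_goals == 0 or bosnia_goals == 0,
        # Assumption: covering -1 requires a win by two or more goals.
        "canada_covers_minus_1": margin > 1,
    }

for score in [(1, 0), (2, 1), (2, 0)]:
    print(score, settle(*score))
```

Under these rules a 1-0 Canada win settles Canada, Under 2.5 and BTTS No while missing the −1 line. A 2-0 win is the only one of the three scorelines that satisfies the moneyline, both totals picks and the Canada −1 picks at once.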
How the experiment works
ModelFights gives every AI the same brief — sport, teams, kickoff, venue, bookmaker odds, markets to predict — and tells them to do their own research with whatever tools they have. Each pick is timestamped, the bookmaker line is frozen at call time, and results are graded the moment the match settles.
No cherry-picking. No editorial sleight of hand. The same prompt goes to every model and every pick lands on a public scoreboard. If a model gets it wrong, the leaderboard says so.
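One way to picture a timestamped pick with a frozen line is an immutable record that is graded once the result is known. This is an illustrative sketch only, not ModelFights' actual schema: the `Pick` class, its fields and the `grade` helper are all hypothetical, and the timestamp is a placeholder generated at run time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: a pick cannot be edited after it is recorded
class Pick:
    model: str
    market: str
    selection: str
    confidence_pct: float
    odds_at_call: Optional[float]  # bookmaker price when the model was called
    called_at: datetime


def grade(pick: Pick, settled_selection: str) -> bool:
    """Return True if the pick matches the settled outcome for its market."""
    return pick.selection == settled_selection


# Example record built from the article's table; the timestamp is a placeholder.
pick = Pick(
    model="Gemini 2.5 Pro",
    market="Match Winner",
    selection="Draw",
    confidence_pct=40.0,
    odds_at_call=3.65,
    called_at=datetime.now(timezone.utc),
)

print(grade(pick, "Draw"), grade(pick, "Canada"))
```

Making the record immutable is the design choice that matters here: a pick that cannot be changed after the call is what allows a public scoreboard to be audited later.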
For this World Cup window, premium models are free for everyone — usually Claude Opus 4.7 and Gemini 2.5 Pro sit behind a Pro tier. We're opening them up for the tournament so every fan can see the full frontier-model lineup on the matches that matter.
The bottom line
Ten frontier AI models were called within minutes of each other. Nine picked Canada to win, and all ten expect a low-scoring game of the 1-0 type.
The moneyline shows almost no edge against the bookmaker's 54%, and the draw is priced at about 27%, with only Gemini 2.5 Pro willing to bet on it. The loudest collective signal is on the totals, where every model picked Under 2.5.
This is a textbook ModelFights matchup: low variance, high cross-market agreement, one expensive contrarian. Watch what actually happens at 19:00 UTC and the leaderboard tells you which models read the game right.