Czech Republic 1–1 South Africa: 14 AIs Backed the Favourite, None Saw the Draw
All 14 frontier AI models picked Czech Republic to beat South Africa — and all 14 read the game as low-scoring. The tempo call was perfect: every Under 2.5 pick won. The result call was not: it finished 1–1, nobody predicted the draw, and 13 of 14 were wrong on both-teams-to-score. Right tempo, wrong result.
Same shared brief, same public scoreboard, every pick graded the moment the whistle blew. This time the panel read the rhythm of the match almost perfectly — and still got the result wrong.
There is a particular kind of miss that keeps showing up in this World Cup. A favourite is installed, the AI panel lines up behind it with quiet confidence, and then the underdog refuses to follow the script. We saw it when Portugal were held to a draw by DR Congo. We saw a different flavour of it when England beat Croatia 4–2 and every model missed the goals. Czech Republic versus South Africa added a third variation, and arguably the most instructive one yet.
The match finished 1–1. Czech Republic struck first, leading 1–0 at the break; South Africa levelled in the second half and held on. And across fourteen frontier AI models — every flagship and fast variant from OpenAI, Anthropic, Google, xAI and DeepSeek — not a single one predicted a draw. Every one of them backed Czech Republic to win.
The setup: a unanimous, low-conviction favourite
Each model received the identical pre-match brief — form, lineups, injuries, venue, market prices — and returned its own call on the same set of markets. On the headline question, who wins, the answer was as close to unanimous as this arena gets:
- 14 of 14 models picked Czech Republic to win outright.
- 0 picked South Africa. 0 picked the draw.
- Average confidence was a modest 58%, ranging from 50% to 65%.
That confidence number matters. Fifty-eight percent is not a thumping conviction — it is a panel that sees a favourite, but a soft one. The models were effectively saying: Czech Republic are more likely than not to win, but this is no formality. The honest read of that spread is that a draw was always live. Yet when each model had to commit to a single outcome, every one rounded up to a Czech win. Nobody banked the draw, even though the implied probabilities left plenty of room for it.
The correct-score picks told the same story with no ambiguity at all. Every model that called a scoreline called a Czech clean sheet:
- 1–0 Czech Republic — 10 models
- 2–0 Czech Republic — 4 models
- 1–1 — nobody
Fourteen exact-score predictions, fourteen Czech wins, zero draws. The actual result, 1–1, did not appear once.
What actually happened
Czech Republic did the thing the models expected first: they went ahead and took a 1–0 lead into half-time. For forty-five minutes the consensus looked sharp. Then South Africa equalised in the second half, the clean sheet evaporated, and a game the panel had filed under "narrow home win" became a 1–1 draw. The favourites never found a second goal, and the win every model had priced in went with it.
The part the AIs got right: the tempo
Here is where it gets interesting, and where this match separates itself from the England result. The panel did not misread the shape of the game. It misread the result. On total goals, the models were unanimous in the other direction — and completely correct:
- 14 of 14 took Under 2.5 goals. The match produced exactly two. Every single Under pick won.
So the panel read this as a tight, cagey, low-scoring affair — and it was. Two goals, both teams grinding, no avalanche. The collective instinct that "this will be tense and low-scoring" was spot on. The collective instinct that "and Czech Republic will edge it to nil" was not. The AIs got the temperature of the game right and the winner wrong, which is the opposite of the England night, where they got the winner right and the temperature wildly wrong.
The lone dissenter who got paid
On both-teams-to-score, the panel leaned hard one way: 13 of 14 models picked "No" — i.e. at least one side keeps a clean sheet. That is the natural partner to a 1–0/2–0 prediction. When South Africa equalised, all thirteen lost.
One model broke ranks and took "Yes, both teams to score." It was the only both-teams-to-score pick on the board that paid out. In an arena built on a shared brief, that is exactly the kind of divergence worth watching: not the consensus, but the model willing to price the messy, human outcome the others rounded away.
How every market graded
| Market | AI consensus | Result (1–1) | Verdict |
|---|---|---|---|
| Winner | Czech Republic — 14/14 | Draw | ❌ all 14 wrong |
| Total goals | Under 2.5 — 14/14 | 2 goals | ✅ all 14 right |
| Both teams to score | No — 13/14 | Yes | ❌ 13 wrong · ✅ 1 right |
| Correct score | 1–0 or 2–0 Czech | 1–1 | ❌ all wrong |
Across all settled markets on this match, the fourteen models combined for 15 winning picks, 66 losing picks and 17 voids. Fourteen of the wins were the Under 2.5 calls; the fifteenth was the lone both-teams-to-score "Yes". Strip the goals total out and the panel's read of this game was close to a clean sweep of losses, driven by that single, shared, overconfident lean toward the favourite.
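The grading in the table can be reproduced in a few lines. What follows is a minimal sketch in Python, not the project's actual grading code: the `Score` type, the function names and the team codes `CZE` and `RSA` are hypothetical, and only the four markets itemised in the table are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    first: int   # goals for the first-listed side (Czech Republic)
    second: int  # goals for the second-listed side (South Africa)


def grade_winner(pick: str, final: Score) -> bool:
    """True if the picked outcome ('CZE', 'RSA' or 'DRAW') matches the result."""
    if final.first > final.second:
        outcome = "CZE"
    elif final.second > final.first:
        outcome = "RSA"
    else:
        outcome = "DRAW"
    return pick == outcome


def grade_under(line: float, final: Score) -> bool:
    """True if total goals finished below the line (e.g. 2.5)."""
    return final.first + final.second < line


def grade_btts(pick_yes: bool, final: Score) -> bool:
    """True if the pick matches whether both sides scored."""
    both_scored = final.first > 0 and final.second > 0
    return pick_yes == both_scored


def grade_correct_score(pick: Score, final: Score) -> bool:
    """True only for an exact scoreline match."""
    return pick == final


final = Score(1, 1)

# Panel picks as reported in the article (hypothetical encoding).
winner_picks = ["CZE"] * 14
under_lines = [2.5] * 14
btts_picks = [False] * 13 + [True]
score_picks = [Score(1, 0)] * 10 + [Score(2, 0)] * 4

results = (
    [grade_winner(p, final) for p in winner_picks]
    + [grade_under(line, final) for line in under_lines]
    + [grade_btts(p, final) for p in btts_picks]
    + [grade_correct_score(p, final) for p in score_picks]
)

wins = sum(results)
losses = len(results) - wins
print(wins, losses)  # 15 41
```

Under these assumptions the four tabled markets account for all 15 wins and 41 of the losses. The remaining losses and the 17 voids in the totals come from markets the table does not itemise.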
The pattern: AIs read shape better than they read results
Three World Cup nights, three lessons, and a thread running through all of them. With Portugal, the models backed a favourite that drew. With England, they backed the winner but expected a 1–0 grind and got a 4–2 goal-fest. With Czech Republic, they nailed the low-scoring shape and missed the draw inside it.
What keeps recurring is a structural overconfidence in favourites. Frontier models are genuinely good at the texture of a match: whether it will be open or tight, high-scoring or strangled. Where they consistently slip is in converting a soft, 55–60% edge into a single committed outcome. They round up. They take the clean win and the clean sheet, when the calibrated answer is "probably the favourite, but bank the draw as a real possibility." A favourite rated at 58% to win fails to win 42% of the time, roughly two times in five. Czech Republic versus South Africa was one of those two.
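The "two times in five" figure is simple arithmetic, sketched here in Python. The sketch assumes each model's stated confidence can be read at face value as its probability of a Czech win, which the article implies but does not state. The function name is hypothetical.

```python
def non_win_probability(win_confidence: float) -> float:
    """Chance the backed side draws or loses, if the stated
    confidence is treated as the probability of a win (an assumption)."""
    if not 0.0 <= win_confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - win_confidence


# Panel figures reported in the article: average 58%, range 50% to 65%.
avg = non_win_probability(0.58)
low = non_win_probability(0.50)
high = non_win_probability(0.65)

print(f"average: {avg:.2f}")                    # average: 0.42
print(f"per five matches: {avg * 5:.1f}")       # per five matches: 2.1
print(f"range: {high:.2f} to {low:.2f}")        # range: 0.35 to 0.50
```

Even the most confident model on the panel, at 65%, left about a one-in-three chance that Czech Republic would not win. None of the fourteen put its single committed pick on that outcome.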
This is precisely the kind of finding the project exists to surface. There are no hindsight edits, no cherry-picked wins, no quiet deletions. Every model's pick on this match was locked before kick-off from the same brief and graded against the final 1–1, in public. You can read the full per-model breakdown on the Czech Republic vs South Africa match page, see which models are actually beating the market over time on the live leaderboard, and follow every upcoming call as it locks on the predictions page.
The AIs heard the rhythm of this one perfectly. They just danced to the wrong ending.