What We Got Right and Wrong: Trump's Iran Ultimatum
On April 5, we published four scenarios for Trump's third ultimatum. Our #1 scenario — partial escalation at 40% — was correct. On April 7, the US struck Kharg Island and bombed bridges across Iran. Power plants were threatened but spared. This is our first retrotrack: what we predicted vs what actually happened, scored transparently.
Predicted vs Actual: D/E/M/A Comparison
The spider chart below overlays our April 5 prediction model (dashed lines) against what actually unfolded on April 7 (solid lines). Key shifts: Diplomacy collapsed further than predicted. Gulf State impact was higher than expected. Aleatoric risk (A) increased significantly — Iran's "restraint is over" creates unpredictable escalation paths we couldn't model.
What We Predicted (April 5)
| Scenario | Our Probability | Rank | Outcome |
|---|---|---|---|
| Partial escalation — limited strikes | 40% | #1 | ✓ THIS HAPPENED |
| Full "all hell" — major infrastructure | 25% | #2 | Not yet (power plants standing) |
| Deadline extended again | 25% | #2 | ✗ Did not happen |
| Diplomatic breakthrough | 10% | #4 | ✗ Did not happen |
What Actually Happened (April 7)
Kharg Island struck — Iran's main oil export hub (90% of exports). Military targets, air defenses, naval base, mine storage.
Bridges bombed — Kashan, Tabriz, Qom, Karaj. At least 2 killed.
Power plants NOT hit — threatened but spared. The war-crime line not yet crossed.
Trump: "A whole civilization will die tonight."
Iran: "Restraint is over."
◆ Scoring
| Metric | Value | Meaning |
|---|---|---|
| Log Loss | 0.916 | −log(0.40) = 0.916. Right outcome, only 40% confident. Should have been 55–60%. |
| Hit Rate | 1/1 = 100% | Our #1 ranked scenario was correct. |
| Calibration | Under-confident | Direction right, magnitude wrong. Need sharper probabilities. |
What We Got Right
1. Partial escalation was most likely. We correctly identified that Trump would act but not go "full hell" immediately. Kharg Island strikes = maximum economic pressure without crossing the civilian infrastructure war-crime line.
2. Trump threat pattern holds. His historical pattern: threats exceed action. "Fire and fury" for NK was never executed. Suleimani was. Third ultimatum split the difference — action, but calibrated.
3. Iran won't back down. We predicted Iran would reject the ultimatum. They called it "helpless, nervous, unbalanced and stupid." Then declared "restraint is over."
What We Got Wrong
Under-weighted Trump's credibility trap
We gave 25% to "deadline extended again." Should have been 10–15%. This was his third ultimatum — extending again would have been humiliating. The F-15 shootdown added domestic pressure. Inaction after a downed American jet was politically impossible.
Under-weighted partial escalation
40% was too low. Should have been 55–60%. The military had targets ready. Kharg Island = maximum economic pressure without war crimes. Path of least resistance for a president who needed to look strong.
Over-weighted diplomatic breakthrough
10% for diplomacy was too high. Iran publicly rejected every proposal. Trump said 45-day ceasefire was "not good enough." No back-channel signals. Should have been 3–5%.
What We Should Have Said
| Scenario | We Said | Should Have Been | Error |
|---|---|---|---|
| Partial escalation | 40% | 55–60% | Under-confident by 15–20pts |
| Full "all hell" | 25% | 25–30% | About right |
| Deadline extended | 25% | 10–15% | Over-weighted by 10–15pts |
| Diplomatic breakthrough | 10% | 3–5% | Over-weighted by 5–7pts |
Lessons Learned
Lesson 1: Credibility compounds
Each repeated threat without action damages credibility. But there's a threshold where inaction becomes more costly than action. Third ultimatum was that threshold.
Lesson 2: Domestic pressure matters
The F-15 shootdown created political pressure we under-weighted. First major visible loss creates demand for response. The domestic political cost of inaction exceeded the cost of action.
Lesson 3: Middle options are often most likely
Between "all hell" and "nothing," there's usually a calibrated response. Military planners prefer proportional escalation. We spread probability too evenly across extreme scenarios.
Verdict
Direction: correct. Confidence: insufficient. Lessons: three concrete adjustments.
Our first scored prediction got the most important thing right: partial escalation was the most likely outcome, and it happened. But 40% wasn't confident enough. The errors were in the tail distribution: too much weight on "extension" and "diplomacy," not enough on the scenario that had the most evidence behind it.
One prediction doesn't make a track record. But it's a start — and we publish the misses alongside the hits.
Scoring: Log Loss 0.916 | Hit Rate 100% (1/1) | Cumulative predictions scored: 1
Framework: D/E/M/A. Original: iran-trump-ultimatum-2026-001 (April 5, 2026).
References
Every prediction scored. Every miss published. This is the product.
See all insights → How D/E/M/A works