Why does the model sometimes pick a less-accomplished player over a name?
Usually because of the interaction between small-sample shrinkage and the aging curve. A young player with few grass matches gets their grass Elo pulled up toward their hard/clay overall (the pull is capped at 50%), and the aging curve credits players in the grass-peak window (roughly ages 22-27). Combined, these can tip a close matchup toward the younger, less grass-experienced player. The /wimbledon/match page shows exactly which driver is doing the work for any given match: if grass Elo favors the veteran but aging flips the call, you'll see both contributions side by side.
Why grass-Elo shrinkage K=40 with a 50% cap?
Without shrinkage, sparse grass samples produce artifacts. Sinner's 39 grass matches gave a raw grass Elo about 280 points below his cross-surface overall, leaving him underrated entering 2026 Wimbledon despite being world #1. K=40 fixed that. But uncapped, K=40 over-corrected for players with fewer than 20 grass matches: Fils (11 matches) was pulled 78% of the way toward his overall rating and ended up higher on grass than Shelton (80 matches), which is wrong. The 50% cap essentially preserves Sinner's correction (his weight w was already 0.506, so the cap barely trims it) and stops the laundering of overall ratings into grass ratings for everyone else.
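The figures above are consistent with a shrinkage weight of the form w = K / (n + K), where n is the player's grass match count: 40 / (39 + 40) is about 0.506 for Sinner and 40 / (11 + 40) is about 0.78 for Fils. The sketch below assumes that form and a simple linear blend between the raw grass rating and the overall rating. The function names and the blend itself are illustrative assumptions, not taken from the model's code.

```python
def shrink_weight(n_grass: int, k: float = 40.0, w_max: float = 0.5) -> float:
    """Weight placed on the overall rating (assumed form: K / (n + K), then capped)."""
    w = k / (n_grass + k)
    return min(w, w_max)


def shrunk_grass_elo(raw_grass: float, overall: float, n_grass: int,
                     k: float = 40.0, w_max: float = 0.5) -> float:
    """Blend raw grass Elo toward the overall rating (assumed linear blend)."""
    w = shrink_weight(n_grass, k, w_max)
    return (1.0 - w) * raw_grass + w * overall


if __name__ == "__main__":
    # Grass match counts quoted in the FAQ; w_max=1.0 disables the cap.
    for name, n in [("Fils", 11), ("Sinner", 39), ("Shelton", 80)]:
        uncapped = shrink_weight(n, w_max=1.0)
        capped = shrink_weight(n)
        print(f"{name}: n={n} uncapped w={uncapped:.3f} capped w={capped:.3f}")
```

Under this assumed form, a 0.5 cap binds only when n is below K, that is, for players with fewer than 40 grass matches. Sinner (n=39) is trimmed from 0.506 to 0.5, while Shelton (n=80, w of about 0.333) is untouched.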
Why not steepen the aging tail to handle these cases?
Because that would be overfitting: steepening the aging tail to fix one player's projection is fitting noise. The shrinkage cap is principled and can be tested out of sample (OOS). The aging tail stays where the data put it.
How does the model update after each match?
Per-surface Elo updates after each completed match (a standard Elo update with a surface-specific K-factor, which is a separate parameter from the shrinkage constant K=40). Aging curves and archetype assignments are held fixed for the duration of the tournament. The Wimbledon refresh script (D:/Tennis Projections/scripts/refresh_during_wimbledon.py) pulls the previous day's results, re-derives Elo, re-runs the bracket Monte Carlo (MC) simulation, and redeploys the bracket, per-pair odds, and projected JSON. The /wimbledon page picks up the updated probabilities on the next page load.
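For readers unfamiliar with the two building blocks named here, the following is a minimal sketch of a textbook Elo update and a single-elimination bracket Monte Carlo. It is not the refresh script's code: the function names, the K-factor you pass in, the ratings, and the four-player draw are all hypothetical, and the win probability uses Elo alone, whereas the model also applies aging and archetype adjustments.

```python
import random
from typing import Callable, Dict, List, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float) -> Tuple[float, float]:
    """Return (new_winner, new_loser) ratings after one completed match."""
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


def simulate_bracket(draw: List[str], win_prob: Callable[[str, str], float],
                     rng: random.Random) -> str:
    """Play one single-elimination bracket; draw length must be a power of two."""
    alive = list(draw)
    while len(alive) > 1:
        # Adjacent entries in the draw meet each other in every round.
        alive = [
            a if rng.random() < win_prob(a, b) else b
            for a, b in zip(alive[0::2], alive[1::2])
        ]
    return alive[0]


def champion_odds(draw: List[str], win_prob: Callable[[str, str], float],
                  n_sims: int = 10_000, seed: int = 0) -> Dict[str, float]:
    """Estimate each player's title probability as a share of simulated brackets."""
    rng = random.Random(seed)
    counts = {p: 0 for p in draw}
    for _ in range(n_sims):
        counts[simulate_bracket(draw, win_prob, rng)] += 1
    return {p: c / n_sims for p, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical grass ratings, for illustration only.
    ratings = {"A": 2100.0, "B": 1950.0, "C": 2000.0, "D": 1900.0}
    odds = champion_odds(
        list(ratings), lambda a, b: expected_score(ratings[a], ratings[b])
    )
    print(odds)
```

Because the simulation walks the draw in order, a player's title odds depend on who sits in their half, which is the sense in which a bracket simulation is draw-aware.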
What does the model NOT account for?
Injuries (a player carrying a thigh strain isn't surfaced in the projection; we only see match results, not in-tournament withdrawals or load management). Weather (grass plays differently in heat, in humidity, and under the roof). Match scheduling and accumulated fatigue (back-to-back five-setters). Court assignment effects (Centre vs Court 1 vs outside courts; bounce and feel differ). Recent off-court news (coaching changes, personal matters). Doubles partnerships (singles only for now). These all matter, but we don't ingest them and we won't pretend otherwise.
How should I read counterintuitive calls?
Open the /wimbledon/match page for that pairing. It shows the model's full breakdown: grass Elo (raw, shrunk, and sample size), age and aging contribution, and archetype matchup. If a call surprises you, one of those drivers is doing unusual work, usually small-sample shrinkage on a younger player or an archetype edge you didn't expect. The model is making a probabilistic statement, not guaranteeing an outcome; a 55% call still loses 45% of the time.
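One way to make the "55% call" reading concrete is to group past calls by their stated probability and compare against how often the favored player actually won. The sketch below is a generic calibration check with made-up outcomes. The function name and the data are hypothetical and do not describe how the model's own calibration is computed.

```python
from typing import Iterable, Optional, Tuple


def observed_win_rate(calls: Iterable[Tuple[float, bool]],
                      lo: float, hi: float) -> Optional[float]:
    """Share of calls with lo <= predicted probability < hi that actually came in."""
    outcomes = [won for p, won in calls if lo <= p < hi]
    if not outcomes:
        return None  # no calls fell in this probability bucket
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Made-up outcomes for illustration: twenty 55% calls, eleven of which came in.
    calls = [(0.55, True)] * 11 + [(0.55, False)] * 9
    print(observed_win_rate(calls, 0.50, 0.60))
```

If the 55% calls are well calibrated, the observed rate in that bucket should sit near 0.55 over a large enough sample, which means roughly nine losses in every twenty such calls are expected rather than a sign of a broken model.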
How was the 50% cap on K=40 shrinkage derived?
After the Wimbledon 2026 draw was posted, an audit of seven flagged matches found that five shared the same defect: K=40 was overpowered for players with small grass samples, hauling their grass Elo 60-80% of the way toward their hard-court overall. Capping the weight at w_max=0.5 was the smallest change that fixed all five without disturbing the specialists with n>=40 grass matches. Verified with a re-run of the bracket Monte Carlo (MC): 5 of 7 calls flipped to the intuitive winner; Sinner's champion% was unchanged at 47.6%.
Is WTA the same fidelity as ATP?
Yes. Same engine class, same surface-specific Elo with the shrinkage cap, same per-archetype aging (WTA archetype clustering uses K=4 clusters vs ATP's K=7), same draw-aware simulation. This was an equity-first release: women's tennis is not a junior partner.
What about doubles?
Singles only for now. Doubles projection is on the roadmap but has not shipped yet. When it does, it will have its own page, with out-of-sample (OOS) calibration reported in the release notes.