🏒 NHLForecasts.com
Data-Driven NHL Predictions & Analytics
2025-26 Season Live

Enhancing In-Game Win Probability with Expected Goals

On November 25, 2025, we completed a major upgrade to our in-game win probability model by integrating Expected Goals (xG) metrics. This enhancement allows our model to consider shot quality alongside actual scoring, providing more nuanced and accurate win probability curves throughout each game.

Two-Phase Development

The integration happened in two critical phases:

Phase 1: Critical Bug Fixes (Nov 25, 12:04 UTC)

Before integration, we discovered and fixed two critical bugs in our xG model that were affecting prediction accuracy:

The Angle Calculation Bug

Our shot angle calculation was fundamentally broken. We were using atan(dy/dx), which only returns angles between -90° and 90° and therefore cannot distinguish a position in front of the net from the mirrored position behind it.

Impact: Shots from behind the net were being treated as high-quality slot shots:

  • Behind net (95, 0): calculated as 0° (slot quality)
  • Should be 180° (behind net, very low quality)

The Fix: We switched to atan2(dy, dx) for proper quadrant handling across the entire rink. This ensures shots from different ice positions get appropriate angle assessments.

# Before (buggy)
angle = math.atan(dy / dx)  # Discards the sign of dx; divides by zero when dx == 0

# After (fixed)
angle = math.atan2(dy, dx)  # Uses the signs of dy and dx to pick the correct quadrant

This fix is implemented in nhl-revamp/shot_data.py lines 208-234.
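The difference is easiest to see on the behind-the-net shot at (95, 0) described above. The following is a minimal sketch rather than the code in shot_data.py: it assumes the attacking net sits at (89, 0) in rink coordinates, takes dx and dy as the offset from the shot to the net, and uses hypothetical function names.

```python
import math

NET_X, NET_Y = 89.0, 0.0  # assumed net location in feet; not taken from shot_data.py


def shot_angle_atan(x: float, y: float) -> float:
    """Buggy variant: atan(dy / dx) maps in-front and behind-the-net shots to the same angle."""
    dx, dy = NET_X - x, NET_Y - y
    return abs(math.degrees(math.atan(dy / dx)))  # also fails outright when dx == 0


def shot_angle_atan2(x: float, y: float) -> float:
    """Fixed variant: atan2 uses the signs of both arguments to resolve the quadrant."""
    dx, dy = NET_X - x, NET_Y - y
    return abs(math.degrees(math.atan2(dy, dx)))


behind_net = (95.0, 0.0)
print(shot_angle_atan(*behind_net))   # 0.0
print(shot_angle_atan2(*behind_net))  # 180.0
```

Under these assumptions, a slot shot at (80, 0) yields 0° from both functions, so only the positions behind the goal line change.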

Enhanced Strength State Detection

Previously, our model could only distinguish between "Even" and "NonEven" situations. This meant we couldn't tell the difference between:

  • High-danger power play shots (5-on-4)
  • Low-danger penalty kill shots (4-on-5)
  • Empty net situations

The Fix: We enhanced the strength state mapping to parse the NHL API's situation codes and determine the shooting team's context:

  • Even: 5-on-5 or equal strength
  • PowerPlay (PP): Team has man advantage
  • PenaltyKill (PK): Team is short-handed
  • EmptyNet: Opponent pulled their goalie
  • Other: Unusual situations

This enhancement dramatically improved our xG model's ability to assess shot quality in different game situations. The implementation is in nhl-revamp/expected_goals.py lines 104-148.
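To make the mapping concrete, here is a rough sketch of how a situation code could be turned into one of the five states. It is not the code in expected_goals.py. It assumes the code is a four-digit string ordered as away goalie, away skaters, home skaters, home goalie; the function name, the precedence of EmptyNet over the other states, and the handling of malformed codes are illustrative assumptions.

```python
def strength_state(situation_code: str, shooter_is_home: bool) -> str:
    """Classify game strength from the shooting team's point of view (hypothetical helper)."""
    if len(situation_code) != 4 or not situation_code.isdigit():
        return "Other"  # assumption: unparseable codes fall into the catch-all bucket

    # Assumed digit layout: away goalie, away skaters, home skaters, home goalie
    away_goalie, away_skaters, home_skaters, home_goalie = (int(c) for c in situation_code)

    if shooter_is_home:
        own, opp, opp_goalie = home_skaters, away_skaters, away_goalie
    else:
        own, opp, opp_goalie = away_skaters, home_skaters, home_goalie

    if opp_goalie == 0:
        return "EmptyNet"  # assumption: an empty opposing net takes precedence
    if own == opp:
        return "Even"
    return "PowerPlay" if own > opp else "PenaltyKill"


print(strength_state("1551", shooter_is_home=True))   # Even
print(strength_state("1451", shooter_is_home=True))   # PowerPlay
print(strength_state("1451", shooter_is_home=False))  # PenaltyKill
print(strength_state("0651", shooter_is_home=True))   # EmptyNet
```

Note that in this sketch a team skating 6-on-5 with its own goalie pulled is labeled PowerPlay; a production mapping may prefer to route that case to Other.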

Current Performance (Temporal Training):

  • ROC AUC: 0.755 (excellent discrimination)
  • Log Loss: 0.218 (well-calibrated probabilities)
  • Brier Score: 0.058 (low prediction error)
  • Training data: 141,407 shots with 7.16% goal rate
  • Temporal split: 47K training, 12K test (chronological)
  • Production accuracy: 0.983 finishing rate (near-perfect calibration)
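For readers less familiar with the Brier score and log loss reported above, the sketch below shows how both are computed and what a naive baseline would score. It is illustrative only: the helper names are ours, and the baseline assumes the 7.16% goal rate also holds on the evaluation set, which may not be exactly true.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Baseline: predict the overall goal rate for every shot (synthetic labels, not real data).
goal_rate = 0.0716
labels = [1] * 716 + [0] * 9284
baseline = [goal_rate] * len(labels)

print(round(brier_score(labels, baseline), 4))  # 0.0665
print(round(log_loss(labels, baseline), 4))
```

Under that assumption, a constant prediction scores a Brier of roughly 0.066 and a log loss of roughly 0.26, so the reported 0.058 and 0.218 both improve on the no-skill baseline.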

Note: The December 2025 update implements temporal training (Issue #20) to prevent data leakage: the model is trained only on earlier shots and evaluated on later ones, so the metrics above reflect performance on games it has not seen.
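A chronological split of this kind can be sketched in a few lines. This is an assumption-laden illustration, not the Issue #20 implementation: the game_date field, the function name, and the 20% test fraction (chosen to roughly match the 47K/12K split above) are all hypothetical.

```python
from datetime import date


def temporal_split(shots, test_fraction=0.2):
    """Train on the earliest shots and test on the latest, with no shuffling."""
    ordered = sorted(shots, key=lambda s: s["game_date"])
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


# Synthetic shots for illustration only
shots = [
    {"game_date": date(2025, 1, 9), "goal": 0},
    {"game_date": date(2024, 10, 12), "goal": 1},
    {"game_date": date(2024, 11, 3), "goal": 0},
    {"game_date": date(2025, 3, 21), "goal": 0},
    {"game_date": date(2024, 12, 15), "goal": 1},
]
train, test = temporal_split(shots)
print(len(train), len(test))  # 4 1
```

Every training shot precedes every test shot, which is what prevents later games from leaking into training. A stricter version would cut on a game boundary so that shots from one game never land on both sides.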

Phase 2: In-Game Integration (Nov 25, 17:08 UTC)

With the xG model fixed and validated, we integrated it into our in-game win probability model by adding three new features:

  1. home_xg: Cumulative expected goals for home team
  2. away_xg: Cumulative expected goals for away team
  3. xg_diff: Home xG minus Away xG

Technical Implementation

The integration required careful feature engineering to maintain prediction quality:

Batch xG Calculation

Rather than calculating xG for each shot individually during game progression, we pre-calculate xG probabilities for all shot events in a single batch. This is more efficient and ensures consistent predictions.

Cumulative Tracking

As we walk through each game event chronologically, we accumulate xG values for both teams. After each event (goal, shot, penalty, etc.), we calculate:

  • Total expected goals generated by the home team so far
  • Total expected goals generated by the away team so far
  • The differential between them
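The running totals described above could be computed along the following lines. This is a simplified sketch, not the code in features.py: the event dictionaries, the elapsed_seconds and team fields, and the function name are assumptions made for illustration.

```python
def add_cumulative_xg(events):
    """Attach running home_xg, away_xg and xg_diff to each event in time order."""
    home_xg = away_xg = 0.0
    rows = []
    for ev in sorted(events, key=lambda e: e["elapsed_seconds"]):
        xg = ev.get("xg") or 0.0  # non-shot events carry no xG
        if ev.get("team") == "home":
            home_xg += xg
        elif ev.get("team") == "away":
            away_xg += xg
        rows.append({**ev, "home_xg": home_xg, "away_xg": away_xg,
                     "xg_diff": home_xg - away_xg})
    return rows


# Synthetic events for illustration only
events = [
    {"elapsed_seconds": 100, "type": "shot", "team": "home", "xg": 0.07},
    {"elapsed_seconds": 200, "type": "shot", "team": "away", "xg": 0.11},
    {"elapsed_seconds": 300, "type": "penalty", "team": "away", "xg": None},
    {"elapsed_seconds": 400, "type": "shot", "team": "home", "xg": 0.30},
]
for row in add_cumulative_xg(events):
    print(row["type"], round(row["home_xg"], 2), round(row["away_xg"], 2), round(row["xg_diff"], 2))
```

Because each row only sees totals up to and including its own event, the features never use information from later in the game.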

Event-Level Merging

To connect shot data with game events, we merge on both game_id and event_id from the NHL API. This ensures each shot event carries its xG value through the feature pipeline.
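A join on the composite key might look like the sketch below. It is a plain-Python illustration rather than the code in model.py; the dictionary layout and function name are hypothetical, and it assumes event_id is only unique within a single game, which is why game_id is part of the key.

```python
def attach_xg(events, shots):
    """Left-join xG onto events by (game_id, event_id); non-shot events get 0.0."""
    xg_by_key = {(s["game_id"], s["event_id"]): s["xg"] for s in shots}
    return [
        {**ev, "xg": xg_by_key.get((ev["game_id"], ev["event_id"]), 0.0)}
        for ev in events
    ]


# Synthetic rows for illustration only
shots = [
    {"game_id": 2024020011, "event_id": 53, "xg": 0.07},
    {"game_id": 2024020012, "event_id": 53, "xg": 0.41},  # same event_id, different game
]
events = [
    {"game_id": 2024020011, "event_id": 52, "type": "faceoff"},
    {"game_id": 2024020011, "event_id": 53, "type": "shot"},
]
print([e["xg"] for e in attach_xg(events, shots)])  # [0.0, 0.07]
```

Joining on event_id alone would risk matching the second shot to the wrong game in this example.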

The implementation spans multiple files:

  • Feature engineering: nhl-revamp/in_game/features.py lines 24-110
  • Shot data merging: nhl-revamp/in_game/model.py lines 80-114

Model Performance

The enhanced in-game model now has 13 total features (up from 10):

Existing features:

  • pregame_home_win_prob
  • score_diff, period, time_remaining, is_overtime
  • time_remaining_pct, lead_change
  • score_diff_x_time, score_diff_squared, abs_score_diff

New xG features:

  • home_xg, away_xg, xg_diff

Despite the added complexity, the model maintains excellent calibration:

  • Brier Score: 0.1262 (well-calibrated)
  • Log Loss: 0.3862 (strong predictive accuracy)
  • Training: 2023-2024 and 2024-2025 seasons

Real-World Example: Game 2024020011

Let's see how xG integration affects win probability in a real game where the home team won 6-4:

Period  Time   Score (Home-Away)  Home xG  Away xG  Home Win Prob
1       09:52  0-1                0.07     0.11     51.7%
1       13:24  2-2                0.37     0.46     53.5%
1       18:23  3-2                0.85     0.46     78.7%
2       16:49  4-2                1.70     1.02     96.0%
3       17:05  6-4                2.85     2.42     97.9%

Final: 6-4, with 2.85 home xG vs 2.54 away xG

Key Insight: At 09:52 in the first period, the home team was down 0-1 and trailed only slightly in shot quality (0.07 vs 0.11 xG). The model assigned the home team a 51.7% win probability, still just above 50% despite the one-goal deficit.

This demonstrates how xG provides crucial context: the home team wasn't being dominated in shot quality, so the model treated this as a close game rather than overreacting to the one-goal deficit.

Benefits of xG Integration

1. More Nuanced Predictions

Win probability now considers shot quality, not just the scoreboard. A team can be down 2-1 but have accumulated 2.5 xG to their opponent's 0.8, suggesting they're creating better chances and may deserve a higher win probability than the score suggests.

2. Better Game State Representation

Situations like "down 0-1 but out-shooting opponent with quality chances" are now properly distinguished from "down 0-1 and being dominated."

3. Industry-Standard Alignment

Expected Goals is the gold standard for shot quality in hockey analytics. Our integration aligns us with modern sports analytics practices used by NHL teams and leading analytics platforms.

4. Maintained Calibration

Despite adding three new features, our model's calibration metrics (Brier score, log loss) remain excellent. The xG features enhance predictions without introducing instability.

5. Automatic Integration

The xG columns (home_xg, away_xg, xg_diff) are automatically included in all prediction outputs and in-game dashboards, so downstream consumers need no additional code changes.

What's Next

This integration represents a significant step forward in our in-game analytics. Future enhancements we're exploring include:

  • Shot location heatmaps overlaid on win probability curves
  • Player-level xG contributions during critical game moments
  • xG-based "luck index" identifying games where scoring diverged from shot quality
  • Integration of xG metrics into our pregame prediction models

The xG integration demonstrates our commitment to incorporating modern analytics while maintaining the transparency and calibration quality our users expect. You can explore these enhanced win probability curves in the in-game dashboards for any completed game this season.
