Chapter 44 Advanced ~50 min read

Game Outcome Prediction

Building models to predict NBA game winners and point spreads.

Modeling Game Outcomes

Game outcome prediction typically uses team strength estimates, home court advantage, and situational factors (rest, travel, injuries) to forecast results. The fundamental approach regresses historical outcomes on these inputs to learn how each factor predicts winning.
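
As a rough illustration of the regression idea, the sketch below fits a one-variable least-squares line of home margin on rating difference. The function name `fit_margin_model` and the toy numbers are hypothetical, not taken from any particular model. The fitted intercept can be read as an estimate of home court advantage, since it is the margin expected when the two teams are rated equally.

```python
def fit_margin_model(rating_diffs, margins):
    """Least-squares fit of margin = intercept + slope * rating_diff.

    rating_diffs: home rating minus away rating for each past game.
    margins: home score minus away score for the same games.
    Returns (intercept, slope); the intercept estimates home court advantage.
    """
    n = len(rating_diffs)
    if n != len(margins) or n < 2:
        raise ValueError("need at least two games and equal-length inputs")

    mean_x = sum(rating_diffs) / n
    mean_y = sum(margins) / n

    # Sum of squares of x and cross-products of x and y about their means
    sxx = sum((x - mean_x) ** 2 for x in rating_diffs)
    if sxx == 0:
        raise ValueError("rating differences must vary across games")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(rating_diffs, margins))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy sample: five past games
diffs = [-4.0, -1.0, 0.0, 2.0, 5.0]
margins = [-2.0, 3.0, 2.0, 6.0, 7.0]
hca_estimate, slope = fit_margin_model(diffs, margins)
print(round(hca_estimate, 2), round(slope, 2))
```

A real model would add further inputs (rest days, travel, injuries) as extra regressors, which calls for a multiple-regression routine rather than this closed-form single-variable fit.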

Team Strength Estimation

The foundation of game prediction is accurate team strength measurement. Elo ratings, Simple Rating System (SRS), and point-differential-based power rankings all attempt to quantify team quality. More sophisticated approaches use margin-adjusted results with opponent adjustments to estimate true team strength.
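
To make the Elo idea concrete, here is a minimal sketch of the standard Elo expectation and update step applied to a single game. The K-factor of 20 and the 100-point home bonus are illustrative assumptions rather than recommended settings, and the function names are hypothetical.

```python
def elo_expected(home_elo, away_elo, hca_elo=100.0):
    """Home win probability from the standard Elo logistic curve.

    hca_elo is an assumed home court bonus expressed in Elo points.
    """
    diff = home_elo + hca_elo - away_elo
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def elo_update(home_elo, away_elo, home_won, k=20.0, hca_elo=100.0):
    """Return updated (home_elo, away_elo) after one game.

    The winner gains what the loser gives up, so total rating is conserved.
    k (assumed value) controls how quickly ratings react to new results.
    """
    expected = elo_expected(home_elo, away_elo, hca_elo)
    actual = 1.0 if home_won else 0.0
    delta = k * (actual - expected)
    return home_elo + delta, away_elo - delta


# Example: two evenly rated teams, home team wins
new_home, new_away = elo_update(1500.0, 1500.0, home_won=True)
print(round(new_home, 1), round(new_away, 1))
```

Because the home team was already expected to win more often than not, a home win moves the ratings by less than half of K, while an upset by the road team would move them by more.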

def simple_game_prediction(home_rating, away_rating, hca=3.0):
    """Predict game outcome using team ratings and home court advantage"""
    predicted_margin = home_rating - away_rating + hca

    # Convert margin to win probability
    # Approximately: each point of margin = 3.5% win probability
    win_prob = 0.5 + (predicted_margin * 0.035)
    win_prob = max(0.05, min(0.95, win_prob))  # Bound probability

    return {
        'predicted_margin': round(predicted_margin, 1),
        'home_win_prob': round(win_prob, 3)
    }
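
The linear rule above only holds near a margin of zero, which is why it needs clamping. One possible alternative, sketched here, treats the actual margin as normally distributed around the predicted margin and reads the win probability off the normal CDF. The standard deviation of 12 points is an assumption for illustration and should be estimated from real results; the name `margin_to_win_prob` is hypothetical.

```python
import math


def margin_to_win_prob(predicted_margin, margin_sd=12.0):
    """P(home margin > 0) if margin ~ Normal(predicted_margin, margin_sd).

    margin_sd is an assumed spread of actual results around the prediction.
    """
    if margin_sd <= 0:
        raise ValueError("margin_sd must be positive")
    z = predicted_margin / margin_sd
    # Standard normal CDF written in terms of the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for margin in (0.0, 3.0, 7.0, 15.0):
    print(margin, round(margin_to_win_prob(margin), 3))
```

With a 12-point standard deviation, the slope of this curve at zero is 1 / (12 * sqrt(2 * pi)), about 0.033 per point, so it agrees closely with the 3.5% rule of thumb for small margins while flattening out naturally for large ones.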

Incorporating Injuries

Injuries significantly affect game predictions but are challenging to incorporate systematically. Player impact estimates allow converting injury news into team strength adjustments. When a player worth +5 points per 100 possessions is out, the team's expected margin drops by roughly that impact scaled by the share of the game he would have played, less whatever his replacement contributes.
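
One way the conversion from player impact to a team rating adjustment might be sketched is shown below. Every default here (a replacement worth -2 per 100 possessions, 100 possessions per game, a 48-minute game) is an assumption chosen for illustration, and `injury_adjustment` is a hypothetical helper name.

```python
def injury_adjustment(player_impact_per100, minutes_per_game,
                      replacement_impact_per100=-2.0,
                      possessions_per_game=100.0, game_minutes=48.0):
    """Estimated change in team point margin when a player sits out.

    The per-100-possession gap between the player and his replacement is
    scaled by the share of the game the player normally plays and by the
    number of possessions in a game. All defaults are assumed values.
    """
    if not 0 <= minutes_per_game <= game_minutes:
        raise ValueError("minutes_per_game must be between 0 and game_minutes")
    share_of_game = minutes_per_game / game_minutes
    impact_gap = player_impact_per100 - replacement_impact_per100
    return -impact_gap * share_of_game * possessions_per_game / 100.0


# A +5 player who normally plays 36 minutes, replaced by an average player
print(injury_adjustment(5.0, 36.0, replacement_impact_per100=0.0))  # -3.75
```

The result can be added to the team's rating before it is passed to a function such as simple_game_prediction. Note that the full +5 is not lost: the player is on the floor for only part of the game, and his minutes go to someone rather than to no one.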

Prediction Accuracy

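State-of-the-art game prediction models pick the straight-up winner correctly in approximately 68-70% of games over a season sample. That still leaves roughly three games in ten mispredicted, highlighting basketball's inherent unpredictability. Against the spread the bar is very different: random chance is 50%, break-even at standard -110 odds is about 52.4%, and sustained accuracy even a few points above that is rare. Models that claim dramatically higher accuracy are likely overfitting or measuring on non-representative samples.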
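
When evaluating a model against the spread, two small calculations are useful: the model's record against the closing line, and the win rate needed to break even at the offered odds. The sketch below shows one way to compute both. The spread convention (a home spread of -4.5 means the home team is favored by 4.5) and the function names are assumptions for this example.

```python
def break_even_win_rate(american_odds=-110):
    """Win rate needed to break even on bets at the given American odds."""
    if american_odds < 0:
        risk, win = -american_odds, 100
    else:
        risk, win = 100, american_odds
    return risk / (risk + win)


def ats_accuracy(predicted_margins, home_spreads, actual_margins):
    """Share of graded picks that covered the spread.

    Margins are home score minus away score. The model picks the home side
    when predicted_margin + spread > 0 and the away side when it is < 0.
    Games with no model lean, and pushes, are left ungraded.
    """
    wins = 0
    graded = 0
    for pred, spread, actual in zip(predicted_margins, home_spreads,
                                    actual_margins):
        model_lean = pred + spread
        result = actual + spread
        if model_lean == 0 or result == 0:
            continue
        graded += 1
        if (model_lean > 0) == (result > 0):
            wins += 1
    return wins / graded if graded else float("nan")


print(round(break_even_win_rate(-110), 4))  # 0.5238
```

Comparing `ats_accuracy` on held-out games with `break_even_win_rate` gives a quick check on whether a claimed edge would survive the bookmaker's margin; a single season of picks is a small sample, so the comparison is noisy.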

Implementation in R

# Player performance projection
library(tidyverse)

project_player_stats <- function(career_data, player_ages) {
  # Marcel-style projection
  career_data %>%
    arrange(player_id, desc(season)) %>%
    group_by(player_id) %>%
    slice_head(n = 3) %>%
    summarise(
      seasons = n(),
      # Weighted average (recent seasons weighted more)
      proj_pts = weighted.mean(pts, c(5, 3, 2)[1:n()]),
      proj_reb = weighted.mean(reb, c(5, 3, 2)[1:n()]),
      proj_ast = weighted.mean(ast, c(5, 3, 2)[1:n()]),
      .groups = "drop"
    ) %>%
    left_join(player_ages, by = "player_id") %>%
    mutate(
      # Age adjustment
      age_factor = case_when(
        is.na(age) ~ 1.00, # Age missing after the join: no adjustment
        age < 24 ~ 1.05,  # Improvement expected
        age <= 28 ~ 1.00, # Prime
        age <= 32 ~ 0.97, # Slight decline
        TRUE ~ 0.93       # Decline
      ),
      proj_pts = proj_pts * age_factor,
      proj_reb = proj_reb * age_factor,
      proj_ast = proj_ast * age_factor
    )
}

career <- read_csv("player_career_stats.csv")
ages <- read_csv("player_ages.csv")

projections <- project_player_stats(career, ages)
print(projections)
