Chapter 47 Advanced ~45 min read

Injury Risk Modeling

Using analytics to assess and predict player injury risk.

The Injury Problem

Injuries are among the largest sources of uncertainty in player evaluation and roster construction. A player projected for excellent performance provides no on-court value while sidelined. Understanding injury risk helps teams make better roster decisions and manage player workloads to reduce injury probability.

Injury Risk Factors

Research identifies several injury risk factors: injury history (past injuries predict future injuries), age (older players are more injury-prone), playing style (high-impact play increases risk), and workload (more minutes increase exposure). Analytics can quantify these relationships to estimate individual injury probability.

Load Management

Teams increasingly rest players to reduce injury risk, particularly during back-to-back games or condensed stretches of the schedule. Analytics informs when rest provides the greatest benefit by identifying when players are most vulnerable to injury, based on accumulated workload and schedule context.

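def assess_injury_risk(player_data):
    """Simple heuristic injury risk score built from hand-set weights.

    Expects keys: GAMES_MISSED_INJURY, AGE, AVG_MIN, B2B_PLAYED, B2B_AVAILABLE.
    """
    # Avoid dividing by zero when no back-to-backs were available
    b2b_available = player_data['B2B_AVAILABLE']
    b2b_share = player_data['B2B_PLAYED'] / b2b_available if b2b_available else 0.0

    risk_factors = {
        # Share of an 82-game season missed through injury
        'prior_injuries': player_data['GAMES_MISSED_INJURY'] / 82,
        # 5 percentage points per year of age above 27, never negative
        'age_factor': max(0, (player_data['AGE'] - 27) * 0.05),
        # Minutes relative to a 36-minute load, weighted at 0.2
        'minutes_load': player_data['AVG_MIN'] / 36 * 0.2,
        # Share of available back-to-backs actually played
        'back_to_back': b2b_share
    }

    base_risk = 0.10  # 10% baseline injury probability
    total_risk = base_risk + sum(risk_factors.values())

    return {
        'risk_factors': risk_factors,
        'total_risk': min(0.8, round(total_risk, 2))  # Cap at 80%
    }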
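
The schedule-context side of load management can be sketched in a few lines. The example below is a minimal illustration, not a validated rule: the function name `flag_rest_candidates`, the 7-day window, and the 140-minute limit are all hypothetical choices made for the example. It sums a player's minutes over a recent window and suggests rest only when today's game is the second night of a back-to-back and the recent load is high.

```python
from datetime import date, timedelta


def flag_rest_candidates(game_log, today, window_days=7, minutes_limit=140.0):
    """Return (recent_minutes, is_back_to_back, suggest_rest) for one player.

    game_log: list of (date, minutes) tuples for games already played.
    window_days and minutes_limit are illustrative assumptions, not
    thresholds taken from any study.
    """
    window_start = today - timedelta(days=window_days)

    # Accumulated workload: minutes played inside the look-back window
    recent_minutes = sum(m for d, m in game_log if window_start <= d < today)

    # Schedule context: did the player also play yesterday?
    yesterday = today - timedelta(days=1)
    is_back_to_back = any(d == yesterday for d, _ in game_log)

    # Suggest rest only when both conditions hold
    suggest_rest = is_back_to_back and recent_minutes >= minutes_limit
    return recent_minutes, is_back_to_back, suggest_rest


# Example with made-up data: four games in the previous week
log = [
    (date(2024, 1, 1), 36),
    (date(2024, 1, 3), 38),
    (date(2024, 1, 5), 35),
    (date(2024, 1, 6), 37),
]
print(flag_rest_candidates(log, date(2024, 1, 7)))
```

A real system would likely replace the hard threshold with a model-based estimate, but the structure (workload window plus schedule flag) stays the same.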

Contract Implications

Injury risk should factor into contract valuation. A player with high injury probability is worth less than an equally skilled but durable player. Properly accounting for injury risk helps teams avoid overpaying for fragile players and identify undervalued durable options.
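
One simple way to express this is to discount a player's projected value by the time he is expected to miss. The sketch below is an assumption-laden illustration: the function name `risk_adjusted_value` and the `expected_share_missed` parameter (the fraction of a season lost when an injury occurs, defaulted to 0.5 purely for the example) are hypothetical and not taken from the chapter.

```python
def risk_adjusted_value(annual_value, injury_risk, expected_share_missed=0.5):
    """Discount a player's healthy-season value by expected time lost.

    annual_value: projected value when healthy (any consistent unit).
    injury_risk: probability of a significant injury during the season (0 to 1).
    expected_share_missed: assumed fraction of the season lost if injured.
    """
    if not 0.0 <= injury_risk <= 1.0:
        raise ValueError("injury_risk must be between 0 and 1")
    if not 0.0 <= expected_share_missed <= 1.0:
        raise ValueError("expected_share_missed must be between 0 and 1")

    # Expected fraction of the season the player is available
    expected_availability = 1.0 - injury_risk * expected_share_missed
    return annual_value * expected_availability


# Two equally skilled players with different (made-up) injury risks
durable = risk_adjusted_value(20.0, injury_risk=0.10)
fragile = risk_adjusted_value(20.0, injury_risk=0.40)
print(f"durable: {durable:.1f}, fragile: {fragile:.1f}")
```

Under these assumptions the gap between the two figures is the premium a team could reasonably pay for durability.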

Implementation in R

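# Playoff probability simulation
library(tidyverse)

simulate_season <- function(team_ratings, remaining_schedule,
                            playoff_threshold, n_sims = 1000) {
  # Attach ratings to each game.
  # Assumption: team_ratings has columns `team` and `rating`;
  # remaining_schedule has columns `home_team` and `away_team`.
  games <- remaining_schedule %>%
    left_join(team_ratings %>% select(home_team = team, home_rating = rating),
              by = "home_team") %>%
    left_join(team_ratings %>% select(away_team = team, away_rating = rating),
              by = "away_team")

  results <- map_dfr(1:n_sims, function(sim) {
    schedule <- games %>%
      mutate(
        # Win probability based on ratings; the constant 3 favors the home team
        home_wp = 1 / (1 + 10^((away_rating - home_rating - 3) / 10)),
        home_win = runif(n()) < home_wp
      )

    # Aggregate wins
    home_wins <- schedule %>%
      group_by(home_team) %>%
      summarise(wins = sum(home_win), .groups = "drop")

    away_wins <- schedule %>%
      group_by(away_team) %>%
      summarise(wins = sum(!home_win), .groups = "drop")

    # Combine home and away wins; teams missing from one side count as 0
    full_join(
      home_wins %>% rename(team = home_team),
      away_wins %>% rename(team = away_team),
      by = "team"
    ) %>%
      mutate(total_wins = coalesce(wins.x, 0L) + coalesce(wins.y, 0L)) %>%
      select(team, total_wins) %>%
      mutate(sim = sim)
  })

  # Playoff probability: share of simulations reaching the win threshold
  results %>%
    group_by(team) %>%
    summarise(
      avg_wins = mean(total_wins),
      playoff_prob = mean(total_wins >= playoff_threshold) * 100,
      .groups = "drop"
    )
}

ratings <- read_csv("team_ratings.csv")
schedule <- read_csv("remaining_schedule.csv")

# Placeholder value: wins needed over the remaining schedule.
# The original listing leaves this undefined; set it for your own context.
playoff_threshold <- 20

playoff_probs <- simulate_season(ratings, schedule, playoff_threshold)
print(playoff_probs)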

Chapter Summary

You've completed Chapter 47: Injury Risk Modeling.