The Injury Problem
Injuries represent the largest source of uncertainty in player evaluation and roster construction. A player projected for excellent performance provides no on-court value while injured. Understanding injury risk helps teams make better roster decisions and manage player workloads to reduce injury probability.
Injury Risk Factors
Research identifies several injury risk factors: injury history (past injuries predict future injuries), age (older players are more injury-prone), playing style (high-impact play increases risk), and workload (more minutes increase exposure). Analytics can quantify these relationships to estimate individual injury probability.
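One common way to turn these factors into a single probability is a logistic model. The sketch below is illustrative only: the function name injury_probability and every coefficient in COEFFICIENTS are hypothetical placeholders chosen for the example, not values estimated from data. In practice the coefficients would be fitted to historical injury records.

```python
import math

# Hypothetical coefficients for illustration only; real values would be
# estimated from historical injury data.
COEFFICIENTS = {
    "intercept": -2.0,
    "prior_injuries": 0.30,     # per previous injury
    "age_over_27": 0.10,        # per year of age above 27
    "high_impact_style": 0.40,  # applied when play style is high-impact
    "minutes_per_game": 0.03,   # per minute played per game
}


def injury_probability(prior_injuries, age, high_impact_style, minutes_per_game):
    """Map the four risk factors to a probability with a logistic function."""
    z = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["prior_injuries"] * prior_injuries
        + COEFFICIENTS["age_over_27"] * max(0, age - 27)
        + COEFFICIENTS["high_impact_style"] * int(high_impact_style)
        + COEFFICIENTS["minutes_per_game"] * minutes_per_game
    )
    # The logistic function keeps the result strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(injury_probability(prior_injuries=2, age=31,
                             high_impact_style=True, minutes_per_game=34))
```

Because each coefficient is positive, the estimated probability rises with injury history, age, high-impact play, and minutes, which matches the direction of the relationships described above.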
Load Management
Teams increasingly rest players to reduce injury risk, particularly during back-to-back games or condensed stretches of the schedule. Analytics informs when rest provides the greatest benefit by identifying when players are most vulnerable to injury, based on accumulated workload and schedule context.
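As a rough illustration of combining accumulated workload with schedule context, the sketch below flags an upcoming game as a rest candidate when it is the second night of a back-to-back and the player has logged heavy minutes over the preceding week. The function name rest_candidate, the 7-day window, and the 140-minute threshold are all assumptions made for this example, not established cutoffs.

```python
from datetime import date, timedelta


def rest_candidate(game_log, next_game, window_days=7, minutes_threshold=140):
    """Return True if the upcoming game looks like a sensible rest day.

    game_log: list of (date, minutes_played) tuples for games already played.
    next_game: date of the upcoming game.
    window_days, minutes_threshold: hypothetical tuning values.
    """
    window_start = next_game - timedelta(days=window_days)

    # Accumulated workload: minutes played in the trailing window.
    recent_minutes = sum(
        minutes
        for played_on, minutes in game_log
        if window_start <= played_on < next_game
    )

    # Schedule context: did the player see the floor the day before?
    day_before = next_game - timedelta(days=1)
    is_back_to_back = any(
        played_on == day_before and minutes > 0
        for played_on, minutes in game_log
    )

    return is_back_to_back and recent_minutes >= minutes_threshold


if __name__ == "__main__":
    log = [
        (date(2024, 1, 1), 36),
        (date(2024, 1, 3), 36),
        (date(2024, 1, 5), 36),
        (date(2024, 1, 6), 36),
    ]
    print(rest_candidate(log, next_game=date(2024, 1, 7)))
```

Requiring both conditions keeps the rule conservative: a back-to-back after a light week, or a heavy week followed by a day off, does not trigger a rest recommendation.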
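def assess_injury_risk(player_data):
    """Simple injury risk assessment model."""
    # Guard against division by zero when no back-to-backs were available
    b2b_available = player_data['B2B_AVAILABLE']
    b2b_rate = player_data['B2B_PLAYED'] / b2b_available if b2b_available else 0.0

    risk_factors = {
        # Share of an 82-game season missed through injury
        'prior_injuries': player_data['GAMES_MISSED_INJURY'] / 82,
        # 0.05 added per year of age above 27; no penalty at 27 or younger
        'age_factor': max(0, (player_data['AGE'] - 27) * 0.05),
        # Average minutes relative to a 36-minute load, weighted at 0.2
        'minutes_load': player_data['AVG_MIN'] / 36 * 0.2,
        # Share of available back-to-back games actually played
        'back_to_back': b2b_rate,
    }

    base_risk = 0.10  # 10% baseline injury probability
    total_risk = base_risk + sum(risk_factors.values())

    return {
        'risk_factors': risk_factors,
        'total_risk': min(0.8, round(total_risk, 2)),  # Cap at 80%
    }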
Contract Implications
Injury risk should factor into contract valuation. A player with high injury probability is worth less than an equally skilled but durable player. Properly accounting for injury risk helps teams avoid overpaying for fragile players and identify undervalued durable options.
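The comparison between a fragile and a durable player can be made concrete by discounting a player's healthy value by expected time lost. The sketch below is a simplification under stated assumptions: the function name risk_adjusted_value is hypothetical, and share_of_season_lost (the fraction of the season missed if an injury occurs) is an illustrative parameter rather than an empirical figure.

```python
def risk_adjusted_value(value_when_healthy, injury_probability,
                        share_of_season_lost=0.5):
    """Discount a player's value by the expected share of the season missed.

    value_when_healthy: value the player provides over a full healthy season.
    injury_probability: estimated chance of a significant injury (0 to 1).
    share_of_season_lost: assumed fraction of the season missed if injured.
    """
    if not 0.0 <= injury_probability <= 1.0:
        raise ValueError("injury_probability must be between 0 and 1")
    if not 0.0 <= share_of_season_lost <= 1.0:
        raise ValueError("share_of_season_lost must be between 0 and 1")

    expected_availability = 1.0 - injury_probability * share_of_season_lost
    return value_when_healthy * expected_availability


if __name__ == "__main__":
    # Two equally skilled players, differing only in injury risk.
    durable = risk_adjusted_value(20.0, injury_probability=0.10)
    fragile = risk_adjusted_value(20.0, injury_probability=0.40)
    print(f"durable: {durable:.1f}, fragile: {fragile:.1f}")
```

Under these assumptions the gap between the two risk-adjusted values is an estimate of how much less a team should be willing to pay the higher-risk player.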
Implementation in R
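# Playoff probability simulation
library(tidyverse)

# Assumptions: team_ratings has columns `team` and `rating`;
# remaining_schedule has columns `home_team` and `away_team`.
# playoff_threshold is the number of wins needed over the remaining games.
simulate_season <- function(team_ratings, remaining_schedule,
                            playoff_threshold, n_sims = 1000) {
  # Attach each team's rating to the games it plays
  rated_schedule <- remaining_schedule %>%
    left_join(team_ratings %>% select(home_team = team, home_rating = rating),
              by = "home_team") %>%
    left_join(team_ratings %>% select(away_team = team, away_rating = rating),
              by = "away_team")

  results <- map_dfr(seq_len(n_sims), function(sim) {
    schedule <- rated_schedule %>%
      mutate(
        # Home win probability from the rating gap, with a 3-point home advantage
        home_wp = 1 / (1 + 10^((away_rating - home_rating - 3) / 10)),
        home_win = runif(n()) < home_wp
      )

    # Aggregate wins
    home_wins <- schedule %>%
      group_by(home_team) %>%
      summarise(wins = sum(home_win), .groups = "drop")
    away_wins <- schedule %>%
      group_by(away_team) %>%
      summarise(wins = sum(!home_win), .groups = "drop")

    # Combine; teams with no remaining home (or away) games get 0 wins there
    full_join(
      home_wins %>% rename(team = home_team),
      away_wins %>% rename(team = away_team),
      by = "team"
    ) %>%
      mutate(total_wins = coalesce(wins.x, 0L) + coalesce(wins.y, 0L)) %>%
      select(team, total_wins) %>%
      mutate(sim = sim)
  })

  # Playoff probability (as a percentage of simulations)
  results %>%
    group_by(team) %>%
    summarise(
      avg_wins = mean(total_wins),
      playoff_prob = mean(total_wins >= playoff_threshold) * 100,
      .groups = "drop"
    )
}

ratings <- read_csv("team_ratings.csv")
schedule <- read_csv("remaining_schedule.csv")
# The threshold of 20 is a placeholder; set it to the wins still required
playoff_probs <- simulate_season(ratings, schedule, playoff_threshold = 20)
print(playoff_probs)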