The Aging Curve
Player performance follows characteristic patterns with age. Most metrics improve through the early-to-mid 20s, hold through a peak period, and then decline gradually. Understanding these patterns enables better projections and contract valuation. Teams that overpay for declining players or undervalue developing ones make systematic errors that analytics can address.
Skill-Specific Aging
Different skills age differently. Speed and athleticism peak early and decline relatively quickly. Shooting skills and basketball IQ may continue improving into the early 30s. Post players often age better than guards because they rely less on speed. Projections should apply skill-specific aging curves rather than assuming uniform decline.
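One way to picture skill-specific projection is to give each skill its own peak age and its own rates of growth and decline. The sketch below is a minimal illustration only: the `SkillCurve` class, the `SKILL_CURVES` table, and every numeric parameter are hypothetical values chosen to mirror the qualitative claims above, not estimates from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillCurve:
    """Illustrative aging parameters for one skill (all values are assumptions)."""
    peak_age: int        # age at which the skill is assumed to peak
    growth_rate: float   # fractional gain per year before the peak
    decline_rate: float  # fractional loss per year from the peak onward


# Hypothetical parameters: athleticism peaks early and falls quickly,
# shooting keeps improving into the early 30s and fades slowly.
SKILL_CURVES = {
    "athleticism": SkillCurve(peak_age=25, growth_rate=0.03, decline_rate=0.05),
    "shooting": SkillCurve(peak_age=31, growth_rate=0.02, decline_rate=0.02),
}


def age_multiplier(curve: SkillCurve, from_age: int, to_age: int) -> float:
    """Compound the year-over-year changes between from_age and to_age."""
    multiplier = 1.0
    for age in range(from_age, to_age):
        if age < curve.peak_age:
            multiplier *= 1 + curve.growth_rate
        else:
            multiplier *= 1 - curve.decline_rate
    return multiplier


def project_skills(ratings: dict, age: int, target_age: int) -> dict:
    """Project each skill rating separately, using that skill's own curve."""
    return {
        skill: round(value * age_multiplier(SKILL_CURVES[skill], age, target_age), 1)
        for skill, value in ratings.items()
    }


# A 27-year-old rated 80 in both skills, projected three seasons ahead.
projection = project_skills({"athleticism": 80.0, "shooting": 80.0}, age=27, target_age=30)
print(projection)
```

Under these assumed curves the same player loses athleticism while still gaining shooting value over the same three seasons, which a single uniform decline rate would miss.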
Estimating Career Trajectories
Career trajectory modeling uses current performance, age, and historical comparisons to project future value. Young players with strong current performance have high expected career value; older players face limited remaining productive years regardless of current quality.
def estimate_remaining_value(current_war, age, typical_decline=0.15):
    """Estimate remaining career value based on age and current production"""
    retirement_age = 38
    remaining_years = max(0, retirement_age - age)
    remaining_war = 0
    current_level = current_war
    for year in range(remaining_years):
        remaining_war += current_level
        current_level *= (1 - typical_decline)  # Annual decline
        if current_level < 0.5:  # Below replacement
            break
    return round(remaining_war, 1)
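As a rough usage sketch, the function can be called for two players with identical current production but different ages. The function is repeated here only so the snippet runs on its own; the two player profiles are made up for illustration.

```python
def estimate_remaining_value(current_war, age, typical_decline=0.15):
    """Estimate remaining career value based on age and current production."""
    retirement_age = 38
    remaining_years = max(0, retirement_age - age)
    remaining_war = 0
    current_level = current_war
    for year in range(remaining_years):
        remaining_war += current_level
        current_level *= (1 - typical_decline)  # Annual decline
        if current_level < 0.5:  # Below the assumed usefulness floor
            break
    return round(remaining_war, 1)


# Hypothetical players: same current WAR, ten years apart in age.
young_star = estimate_remaining_value(current_war=5.0, age=22)
veteran_star = estimate_remaining_value(current_war=5.0, age=32)

print(f"age 22: {young_star} WAR remaining")
print(f"age 32: {veteran_star} WAR remaining")
```

With the default 15% annual decline, the younger player projects noticeably more remaining value because more productive seasons fit in before the retirement cutoff. The gap is smaller than the age difference alone suggests, since the later seasons contribute little after compounding decline.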
Survivor Bias Considerations
Observed aging curves suffer from survivor bias: we only see players who remained in the league at each age. Players who aged poorly exited the sample, so the players who remain look better than the full cohort would, and the observed decline understates the true average decline. Proper aging analysis must account for this selection effect.
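The selection effect can be seen in a small deterministic toy cohort. The sketch below assumes every player starts at the same level and declines at a different constant rate, and that a player leaves the league once their level drops below a threshold. All constants are invented for illustration, and the exit rule is a deliberate simplification.

```python
# Hypothetical cohort: identical starting level at age 27, different
# annual decline rates. None of these values come from real data.
START_AGE = 27
START_LEVEL = 3.0
EXIT_THRESHOLD = 1.5  # assumed level below which a player leaves the league
DECLINE_RATES = [0.05, 0.10, 0.15, 0.20, 0.25]


def level_at(age: int, decline_rate: float) -> float:
    """True level at a given age under a constant annual decline."""
    return START_LEVEL * (1 - decline_rate) ** (age - START_AGE)


def cohort_means(age: int) -> tuple:
    """Return (mean over everyone, mean over survivors, number of survivors)."""
    levels = [level_at(age, rate) for rate in DECLINE_RATES]
    survivors = [lvl for lvl in levels if lvl >= EXIT_THRESHOLD]
    full_mean = sum(levels) / len(levels)
    survivor_mean = sum(survivors) / len(survivors) if survivors else float("nan")
    return full_mean, survivor_mean, len(survivors)


for age in range(START_AGE, 34):
    full_mean, survivor_mean, n = cohort_means(age)
    print(f"age {age}: all={full_mean:.2f} observed={survivor_mean:.2f} (n={n})")
```

As the fast decliners drop out, the "observed" average (survivors only) stays above the average of the full cohort, so a curve fitted to the observed players would look flatter than the decline the cohort actually experienced.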
Implementation in R
# Pythagorean expectation
library(tidyverse)

calculate_expected_wins <- function(team_stats, exponent = 13.91) {
  team_stats %>%
    mutate(
      # Pythagorean expectation
      pyth_win_pct = pts_scored^exponent /
        (pts_scored^exponent + pts_allowed^exponent),
      expected_wins = round(pyth_win_pct * games, 1),
      # Luck factor
      actual_win_pct = wins / games,
      luck = round((actual_win_pct - pyth_win_pct) * games, 1)
    )
}

team_stats <- read_csv("team_season_stats.csv")
expected <- calculate_expected_wins(team_stats)

# Teams with most luck
lucky_teams <- expected %>%
  arrange(desc(luck)) %>%
  select(team_name, wins, expected_wins, luck)
print(lucky_teams)