Teams as Networks
Basketball teams can be modeled as networks in which players are nodes and interactions are edges. Pass networks show who passes to whom, with each directed edge weighted by how often that pass occurs. Assist networks connect passers to scorers. Screen networks link screeners and ball handlers. Network analysis provides tools for understanding team dynamics that individual statistics do not capture.
Network Metrics
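import networkx as nx


def build_pass_network(play_data):
    """Build a directed, weighted player network from passing data."""
    G = nx.DiGraph()

    # One row per pass; the edge weight counts passes from passer to receiver
    for _, play in play_data.iterrows():
        passer = play['PASSER_ID']
        receiver = play['RECEIVER_ID']
        if G.has_edge(passer, receiver):
            G[passer][receiver]['weight'] += 1
        else:
            G.add_edge(passer, receiver, weight=1)

    # Calculate network metrics
    metrics = {
        # Per-player degree centrality (unweighted), not a single team-level score
        'centralization': nx.degree_centrality(G),
        # Local clustering coefficient per player, ignoring pass direction
        'clustering': nx.clustering(G.to_undirected()),
        # Share of possible directed passer-receiver links that occur at least once
        'density': nx.density(G)
    }
    return G, metrics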
Ball Movement Patterns
Network analysis reveals ball movement patterns that distinguish offensive styles. Highly centralized networks run through one player; distributed networks share the ball widely. Research suggests that more equal ball distribution correlates with offensive efficiency, although a correlation of this kind does not by itself show that sharing the ball causes better offense.
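One simple way to put a number on how centralized or distributed a team's passing is would be the normalized entropy of each player's share of passes made. The sketch below is an illustration under stated assumptions, not the measure used in the research mentioned above: the function name `pass_share_entropy` and the input format (a list of passer and receiver pairs) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def pass_share_entropy(passes):
    """Normalized Shannon entropy of each player's share of passes made.

    `passes` is an iterable of (passer, receiver) tuples (assumed format).
    Returns a value in [0, 1]: 0 means one player made every pass,
    1 means every player in the network made the same number of passes.
    """
    passes = list(passes)
    counts = Counter(passer for passer, _ in passes)

    # Every player who appears as passer or receiver is part of the network
    players = {p for pair in passes for p in pair}
    total = sum(counts.values())
    if total == 0 or len(players) < 2:
        return 0.0

    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Dividing by log(n) rescales the entropy to the range [0, 1]
    return entropy / math.log(len(players))


# A hub-style offense: every pass originates from player "A"
hub = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]

# A fully shared offense: each of five players makes one pass
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]

hub_score = pass_share_entropy(hub)
ring_score = pass_share_entropy(ring)
```

Here `hub_score` sits at the centralized end of the scale and `ring_score` at the distributed end. Counting only passes made ignores who receives the ball, so a fuller treatment would look at incoming passes as well.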
Chemistry Detection
Player pairs who combine well show distinctive network properties: strong connections (frequent passes, assists, or screens between them) and successful outcomes on the actions they share. Network analysis can identify which player combinations have positive chemistry beyond what their individual statistics would predict.
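The idea of chemistry beyond individual statistics can be sketched by comparing a pair's average outcome on shared actions with a naive expectation built from each player's own averages. This is a minimal sketch under assumptions: the function `pair_chemistry`, the `(passer, receiver, points)` record format, the `min_actions` threshold, and the choice of baseline (the mean of the two players' individual averages) are all hypothetical, and small samples will make these scores noisy.

```python
from collections import defaultdict


def pair_chemistry(actions, min_actions=2):
    """Score each passer-receiver pair against a naive individual baseline.

    `actions` is an iterable of (passer, receiver, points) tuples (assumed format).
    The score is the pair's mean points minus the average of the passer's mean
    points on all passes and the receiver's mean points on all receptions.
    Pairs with fewer than `min_actions` shared actions are skipped.
    """
    pair_pts = defaultdict(list)
    passer_pts = defaultdict(list)
    receiver_pts = defaultdict(list)

    for passer, receiver, points in actions:
        pair_pts[(passer, receiver)].append(points)
        passer_pts[passer].append(points)
        receiver_pts[receiver].append(points)

    def mean(values):
        return sum(values) / len(values)

    scores = {}
    for (passer, receiver), pts in pair_pts.items():
        if len(pts) < min_actions:
            continue
        expected = (mean(passer_pts[passer]) + mean(receiver_pts[receiver])) / 2
        scores[(passer, receiver)] = mean(pts) - expected
    return scores


# Toy data: A-to-B actions score, while A-to-C and D-to-B actions do not
actions = [
    ("A", "B", 2), ("A", "B", 2),
    ("A", "C", 0), ("A", "C", 0),
    ("D", "B", 0), ("D", "B", 0),
]
scores = pair_chemistry(actions)
```

In this toy data the A-to-B pair scores above its baseline and the other two pairs score below theirs. A positive score only flags a pair for closer study; it does not account for opponents, lineups, or shot quality.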
Implementation in R
# Causal inference: Treatment effect estimation
library(tidyverse)
library(MatchIt)

estimate_treatment_effect <- function(player_data) {
  # Example: Effect of playing with star player on role player performance

  # Propensity score matching
  matched <- matchit(
    with_star ~ age + experience + prior_pts + prior_ws,
    data = player_data,
    method = "nearest",
    ratio = 1
  )
  matched_data <- match.data(matched)

  # Estimate treatment effect
  treatment_effect <- matched_data %>%
    group_by(with_star) %>%
    summarise(
      n = n(),
      avg_pts = mean(pts),
      avg_ws = mean(ws),
      .groups = "drop"
    )

  # Calculate difference
  effect <- treatment_effect$avg_ws[treatment_effect$with_star == 1] -
    treatment_effect$avg_ws[treatment_effect$with_star == 0]

  list(matched = matched, effect = effect, summary = treatment_effect)
}

players <- read_csv("role_player_data.csv")
result <- estimate_treatment_effect(players)
print(paste("Treatment Effect:", round(result$effect, 2), "Win Shares"))
Implementation in Python
# Causal inference: Difference-in-differences
import pandas as pd
import statsmodels.formula.api as smf


def difference_in_differences(player_data):
    """
    Estimate effect of rule change on player performance
    """
    # DiD regression
    model = smf.ols(
        "performance ~ treatment * post_period + age + experience",
        data=player_data
    ).fit()

    # Treatment effect is the interaction coefficient
    treatment_effect = model.params["treatment:post_period"]

    return {
        "model": model,
        "treatment_effect": treatment_effect,
        "p_value": model.pvalues["treatment:post_period"]
    }


player_data = pd.read_csv("player_rule_change_data.csv")
result = difference_in_differences(player_data)
print(f"Treatment Effect: {result['treatment_effect']:.3f}")