Chapter 54 Advanced ~45 min read

Player Archetype Classification

Using clustering and classification to identify player types and archetypes.

Archetype Analysis

Players can be grouped into archetypes based on statistical profiles and playing styles. Rather than traditional position labels (point guard, center), archetypes capture what players actually do: shoot, create, defend, rebound. Understanding archetypes helps with roster construction, scheme fit, and player comparison.

Clustering Approaches

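from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_player_archetypes(player_stats, n_clusters=8):
    """Identify player archetypes using K-means clustering."""
    features = ['PTS_per100', 'AST_per100', 'REB_per100',
                'STL_per100', 'BLK_per100', 'FG3A_per100',
                'USG_PCT', 'TS_PCT']

    # Work on a copy so the caller's DataFrame is not modified, and drop
    # players with missing values, which K-means cannot handle
    result = player_stats.dropna(subset=features).copy()

    # Standardize so no single feature dominates the distance calculation
    X_scaled = StandardScaler().fit_transform(result[features])

    # n_init is set explicitly because its default differs across
    # scikit-learn versions; random_state makes the labels reproducible
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=42)
    result['archetype'] = kmeans.fit_predict(X_scaled)

    return result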
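
The function above standardizes every feature before clustering. A minimal standard-library sketch of why that step matters: K-means relies on Euclidean distance, so a feature measured in tens (points per 100 possessions) would otherwise swamp one measured in hundredths (true shooting percentage). The player values below are hypothetical, and `standardize` is an illustrative helper, not part of scikit-learn.

```python
from statistics import mean, pstdev

def standardize(column):
    """Z-score a list of values using the population standard deviation,
    the same convention StandardScaler uses by default."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical values for four players (not real data)
pts_per100 = [32.0, 18.0, 25.0, 12.0]
ts_pct = [0.62, 0.55, 0.58, 0.51]

pts_z = standardize(pts_per100)
ts_z = standardize(ts_pct)

# Gap between the first two players on each feature, before and after scaling
raw_gap = (pts_per100[0] - pts_per100[1], ts_pct[0] - ts_pct[1])
scaled_gap = (pts_z[0] - pts_z[1], ts_z[0] - ts_z[1])

print("raw gaps:   ", raw_gap)
print("scaled gaps:", scaled_gap)
```

In raw units the scoring gap is orders of magnitude larger than the efficiency gap; after scaling, the two features contribute on a comparable footing.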

Common NBA Archetypes

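Clustering of this kind typically surfaces recognizable groups: scoring guards, playmaking point guards, 3-and-D wings, stretch bigs, rim-running centers, traditional post players, and various hybrid types. K-means itself only assigns numeric cluster labels; the archetype names come from the analyst, who inspects which statistics stand out in each cluster. These archetypes reflect how the modern NBA values different skill combinations.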
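
One way to move from numeric cluster labels to names like these is to look at which features each cluster centroid is highest on. The sketch below assumes centroids expressed in standardized units (with scikit-learn, a fitted `KMeans` model exposes them as `cluster_centers_`). The centroid values, the `describe_centroid` helper, and its threshold are all hypothetical and chosen only for illustration.

```python
def describe_centroid(centroid, top_n=2, threshold=0.5):
    """Return the features on which a cluster centroid stands out most.

    `centroid` maps feature name -> mean z-score of the cluster's players.
    Only features at or above `threshold` are considered standouts.
    """
    standout = [(name, z) for name, z in centroid.items() if z >= threshold]
    standout.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in standout[:top_n]]

# Hypothetical centroids in standardized units (not real results)
centroids = {
    0: {'PTS_per100': 0.2, 'AST_per100': 1.6, 'REB_per100': -0.7,
        'BLK_per100': -0.8, 'FG3A_per100': 0.4, 'USG_PCT': 0.9},
    1: {'PTS_per100': -0.3, 'AST_per100': -0.9, 'REB_per100': 1.4,
        'BLK_per100': 1.7, 'FG3A_per100': -1.2, 'USG_PCT': -0.4},
}

profiles = {cid: describe_centroid(c) for cid, c in centroids.items()}
for cid, standouts in profiles.items():
    print(cid, standouts)
```

A cluster that stands out on assists and usage reads naturally as a playmaker group, while one that stands out on blocks and rebounds with few three-point attempts looks like a rim-protecting big. The final naming remains a judgment call.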

Roster Construction

Successful teams typically need certain archetype combinations: playmaking, shooting, rim protection, perimeter defense. Archetype analysis helps identify roster holes and trade targets that fill specific needs rather than just adding generic "talent."
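
The roster-hole idea can be sketched as a simple count comparison: tally the archetypes on the current roster and compare against a target mix. The roster, the archetype names, and the required counts below are hypothetical placeholders; a real target mix would depend on the team's scheme.

```python
from collections import Counter

def find_roster_holes(roster_archetypes, required):
    """Return each under-filled archetype with the number of players missing.

    `roster_archetypes` maps player name -> archetype label.
    `required` maps archetype label -> minimum number of players wanted.
    """
    counts = Counter(roster_archetypes.values())
    return {
        archetype: need - counts[archetype]
        for archetype, need in required.items()
        if counts[archetype] < need
    }

# Hypothetical roster and target mix (illustrative only)
roster = {
    'Player A': 'playmaker',
    'Player B': 'shooter',
    'Player C': 'shooter',
    'Player D': 'rim protector',
}
required = {
    'playmaker': 1,
    'shooter': 2,
    'rim protector': 1,
    'perimeter defender': 2,
}

holes = find_roster_holes(roster, required)
print(holes)
```

Here the only shortfall is perimeter defense, which would point trade or free-agent searches at players from that cluster rather than at the best available scorer.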

Implementation in R

# Rest impact analysis
library(tidyverse)

analyze_rest_impact <- function(game_data) {
  game_data %>%
    group_by(rest_days) %>%
    summarise(
      games = n(),
      win_pct = mean(win),
      avg_pts = mean(pts),
      avg_margin = mean(margin),
      fg_pct = mean(fg_pct),
      .groups = "drop"
    )
}

games <- read_csv("game_rest_data.csv")
rest_impact <- analyze_rest_impact(games)

# Visualize rest impact
ggplot(rest_impact, aes(x = rest_days, y = win_pct)) +
  geom_col(fill = "#1d428a") +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Win Percentage by Days of Rest",
       x = "Days of Rest", y = "Win Percentage") +
  theme_minimal()

Chapter Summary

You've completed Chapter 54: Player Archetype Classification.