Understanding NBA Foul Calls with Python¶

PyData NYC • Nov 27, 2017 • @AustinRochford ¶

About Me¶

Principal Data Scientist @ Monetate Labs • PyMC3 contributor¶

@AustinRochford • austinrochford.com • github.com/AustinRochford ¶

austin.rochford@gmail.com • arochford@monetate.com ¶

About this Talk¶

Last Two Minute Report¶

Scraping the data¶

Loading the data¶

In [12]:

orig_df.head(n=2).T

Out[12]:

play_id	20150301CLEHOU-0	20150301CLEHOU-1
period	Q4	Q4
seconds_left	112	103
call_type	Foul: Shooting	Foul: Shooting
committing_player	Josh Smith	J.R. Smith
disadvantaged_player	Kevin Love	James Harden
review_decision	CNC	CC
away	CLE	CLE
home	HOU	HOU
date	2015-03-01 00:00:00	2015-03-01 00:00:00
score_away	103	103
score_home	105	105
disadvantaged_team	CLE	HOU
committing_team	HOU	CLE

Research questions¶

How does game context impact foul calls?
Is (not) committing and/or drawing fouls a measurable player skill?

Exploratory Data Analysis¶

In [14]:

(orig_df.call_type
        .str.split(':', expand=True).iloc[:, 0]
        .value_counts()
        .plot(kind='bar', color=blue, logy=True, title="Call types")
        .set_ylabel("Frequency"));

In [16]:

(foul_df.call_type
        .str.split(': ', expand=True).iloc[:, 1]
        .value_counts()
        .plot(kind='bar', color=blue, logy=True, title="Foul Types")
        .set_ylabel("Frequency"));

Data transformation¶

In [25]:

df.head(n=2).T

Out[25]:

play_id	20151028INDTOR-1	20151028INDTOR-2
seconds_left	89.0	73.0
call_type	4.0	4.0
foul_called	1.0	0.0
player_committing	162.0	36.0
player_disadvantaged	98.0	358.0
score_committing	99.0	106.0
score_disadvantaged	106.0	99.0
season	0.0	0.0

Modeling¶

George Box (via Dustin Tran):

Build a model of the science
Infer the model given data
Criticize the model given data

Build a model of the science¶

In [27]:

make_foul_rate_yaxis(
    df.pivot_table('foul_called', 'season')
      .rename(index=season_enc.inverse_transform)
      .rename_axis("Season")
      .plot(kind='bar', rot=0, legend=False)
);

Each season has a different foul call rate

In [28]:

import pymc3 as pm

with pm.Model() as base_model:
    β_season = pm.Normal('β_season', 0., 5., shape=n_season)
    p = pm.Deterministic('p', pm.math.sigmoid(β_season))

Foul calls are like flipping a weighted coin

In [30]:

with base_model:
    y = pm.Bernoulli(
        'y', p[season],
        observed=df.foul_called.values
    )

Infer the model given data¶

In [32]:

with base_model:
    base_trace = pm.sample(**SAMPLE_KWARGS)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
100%|██████████| 1500/1500 [00:06<00:00, 235.48it/s]

Convergence diagnostics¶

The folk theorem [of statistical computing] is this: When you have computational problems, often there’s a problem with your model.

— Andrew Gelman

In [35]:

(pm.energyplot(base_trace, legend=False, figsize=(6, 4))
   .set_title(CONVERGENCE_TITLE()));

Criticize the model given data¶

In [38]:

resid_df = (df.assign(p_hat=base_trace['p'][:, df.season].mean(axis=0))
              .assign(resid=lambda df: df.foul_called - df.p_hat))

In [39]:

resid_df[['foul_called', 'p_hat', 'resid']].head()

Out[39]:

	foul_called	p_hat	resid
play_id
20151028INDTOR-1	1.0	0.403732	0.596268
20151028INDTOR-2	0.0	0.403732	-0.403732
20151028INDTOR-3	1.0	0.403732	0.596268
20151028INDTOR-4	0.0	0.403732	-0.403732
20151028INDTOR-6	0.0	0.403732	-0.403732

In [40]:

(resid_df.pivot_table('resid', 'season')
         .rename(index=season_enc.inverse_transform))

Out[40]:

	resid
season
2015-2016	-0.000019
2016-2017	0.000026

Intentional fouls¶

In [42]:

make_time_axes(
    resid_df.pivot_table('resid', 'seconds_left')
            .reset_index()
            .plot('seconds_left', 'resid', kind='scatter'),
    ylabel="Residual"
);

Build a model of the science, take two¶

In [44]:

make_time_axes(
    df.pivot_table('foul_called', 'seconds_left', 'trailing_committing')
      .rolling(20).mean()
      .rename(columns={0: "No", 1: "Yes"})
      .rename_axis("Committing team is trailing", axis=1)
      .plot()
);

In [45]:

ax = (df.pivot_table('foul_called', 'call_type')
        .rename(index=call_type_enc.inverse_transform)
        .rename_axis("Call type", axis=0)
        .plot(kind='barh', legend=False))
ax.xaxis.set_major_formatter(pct_formatter);
ax.set_xlabel("Observed foul call rate");

Each call type has a different foul call rate

In [48]:

with poss_model:
    β_call = hierarchical_normal('β_call', n_call_type)

In [51]:

make_time_axes(
    df.pivot_table('foul_called', 'seconds_left', 'trailing_poss')
      .loc[:, 1:3]
      .rolling(20).mean()
      .rename_axis(
          "Trailing possessions\n(committing team)",
          axis=1
      )
      .plot()
);

In [52]:

make_time_axes(
    df.pivot_table('foul_called', 'seconds_left', 'call_type')
      .rolling(20).mean()
      .rename(columns=call_type_enc.inverse_transform)
      .rename_axis(None, axis=1)
      .plot()
);

The shot clock¶

In [55]:

ax = sns.heatmap(
    df.pivot_table('foul_called', 'trailing_poss', 'remaining_poss')
      .rename_axis(
          "Trailing possessions\n(committing team)", axis=0
      )
      .rename_axis("Remaining possessions", axis=1),
    cmap='seismic', cbar_kws={'format': pct_formatter}
)
ax.invert_yaxis();
ax.set_title("Observed foul call rate");

Difference from call type average¶

In [58]:

(sns.FacetGrid(diff_df, col='call_type', col_wrap=3, aspect=1.5)
    .map_dataframe(plot_foul_diff_heatmap)
    .set_axis_labels(
        "Remaining possessions",
        "Trailing possessions\n(committing team)"
    )
    .set_titles("{col_name}"));

The foul call rate changes based on the number of possessions trailing and remaining and the call type

In [59]:

with poss_model:
    β_poss = hierarchical_normal(
        'β_poss',
        (n_trailing_poss, n_remaining_poss, n_call_type),
        σ_shape=(1, 1, n_call_type)
    )

The foul call rate is a combination of season, call type, and possession factors

In [61]:

with poss_model:
    η_game = β_season[season] \
                + β_call[call_type] \
                + β_poss[
                    trailing_poss, remaining_poss, call_type
                ]

Infer the model given data, take two¶

In [63]:

with poss_model:
    poss_trace = pm.sample(**SAMPLE_KWARGS)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
100%|██████████| 1500/1500 [06:34<00:00,  3.80it/s]

Criticize the model given data, take two¶

In [67]:

ax = sns.heatmap(
    resid_df.pivot_table('resid', 'trailing_poss', 'remaining_poss')
      .rename_axis("Trailing possessions\n(committing team)", axis=0)
      .rename_axis("Remaining possessions", axis=1)
      .loc[-3:3],
    cmap='seismic',  cbar_kws={'format': pct_formatter}
)
ax.invert_yaxis();
ax.set_title("Observed foul call rate");

In [69]:

ax = (resid_df.groupby(bins[bin_ix])
              .resid.mean()
              .rename_axis('p_hat', axis=0)
              .reset_index()
              .plot('p_hat', 'resid', kind='scatter'))

ax.xaxis.set_major_formatter(pct_formatter);
ax.set_xlabel(r"Binned $\hat{p}$");

make_foul_rate_yaxis(ax, label="Residual");

In [70]:

ax = (resid_df.groupby('seconds_left')
              .resid.mean()
              .reset_index()
              .plot('seconds_left', 'resid', kind='scatter'))
make_time_axes(ax, ylabel="Residual");

Model selection¶

In [72]:

comp_df = (pm.compare(
                (base_trace, poss_trace),
                (base_model, poss_model)
             )
             .rename(index=MODEL_NAME_MAP)
             .loc[MODEL_NAME_MAP.values()])

In [73]:

comp_df

Out[73]:

	WAIC	pWAIC	dWAIC	weight	SE	dSE	warning
Base	11609.8	1.98	1543.34	0	56.98	73.35	1
Possession	10066.5	81.92	0	1	87.93	0	1

In [74]:

fig, ax = plt.subplots()
ax.errorbar(
    np.arange(len(MODEL_NAME_MAP)), comp_df.WAIC,
    yerr=comp_df.SE, fmt='o'
);
ax.set_xticks(np.arange(len(MODEL_NAME_MAP)));
ax.set_xticklabels(comp_df.index);
ax.set_xlabel("Model");
ax.set_ylabel("WAIC");

Research questions¶

~~How does game context impact foul calls?~~
Is (not) committing and/or drawing fouls a measurable player skill?

Build a model of the science, take three¶

Item-response theory¶

In [76]:

fig

Out[76]:

Each disadvantaged player has an ideal point (per season)

In [79]:

with irt_model:
    θ_player = hierarchical_normal(
        'θ_player', (n_player, n_season)
    )
    θ = θ_player[player_disadvantaged, season]

Each committing player has an ideal point (per season)

In [80]:

with irt_model:
    b_player = hierarchical_normal(
        'b_player', (n_player, n_season)
    )
    b = b_player[player_committing, season]

Players affect the foul call rate through the difference in their ideal points

In [81]:

with irt_model:
    η_player = θ - b

Game and player effects combine to determine the foul call rate

In [82]:

with irt_model:
    η = η_game + η_player

Infer the model given data, take three¶

In [84]:

with irt_model:
    irt_trace = pm.sample(**SAMPLE_KWARGS)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
100%|██████████| 1500/1500 [08:45<00:00,  2.85it/s]

Criticize the model given data, take three¶

In [89]:

ax = (resid_df.groupby(bins[bin_ix])
              .resid.mean()
              .rename_axis('p_hat', axis=0)
              .reset_index()
              .plot('p_hat', 'resid', kind='scatter'))
ax.xaxis.set_major_formatter(pct_formatter);
ax.set_xlabel(r"Binned $\hat{p}$");
make_foul_rate_yaxis(ax, label="Residual");

Model selection¶

In [93]:

fig, ax = plt.subplots()
ax.errorbar(
    np.arange(len(MODEL_NAME_MAP)), comp_df.WAIC,
    yerr=comp_df.SE, fmt='o'
);
ax.set_xticks(np.arange(len(MODEL_NAME_MAP)));
ax.set_xticklabels(comp_df.index);
ax.set_xlabel("Model");
ax.set_ylabel("WAIC");

Is committing and/or drawing fouls a measurable player skill?¶

In [101]:

fig

Out[101]:

In [104]:

fig, ax = plot_latent_params(
    top_bot_irt_df[top_bot_irt_df.param == 'θ']
                  .sort_values('mean')
)
ax.set_xlabel(r"$\hat{\theta}$");
ax.set_title("Top and bottom ten");

In [105]:

fig, ax = plot_latent_params(
    top_bot_irt_df[top_bot_irt_df.param == 'b']
                  .sort_values('mean')
)
ax.set_xlabel(r"$\hat{b}$");
ax.set_title("Top and bottom ten");

In [106]:

player_irt_df.head()

Out[106]:

param	b		θ
season	0	1	0	1
player
0	-0.064689	0.006621	-0.015132	-0.003946
1	-0.006142	0.004387	0.010513	0.037012
2	0.090107	0.092868	-0.034668	-0.003519
3	-0.022949	-0.002456	0.005822	-0.003961
4	-0.006999	0.288897	0.068338	0.007508

Year-over-year consistency (players that appeared in all seasons)¶

In [110]:

style_grid(sns.PairGrid(player_irt_df.loc[player_all_season], size=1.75)
              .map_upper(plt.scatter, alpha=0.5)
              .map_diag(plt.hist)
              .map_lower(plot_corr));

Year-over-year consistency (players that appeared at least ten times in all seasons)¶

In [113]:

grid.fig

Out[113]:

Research questions¶

~~How does game context impact foul calls?~~
~~Is (not) committing and/or drawing fouls a measurable player skill?~~

Visualize the Predictions¶

In [116]:

import dash
import dash_html_components as html

In [129]:

app = dash.Dash()

app.layout = html.Div(children=[
    html.H1(
        "Understanding NBA Foul Calls with Python"
    ),
    html.Center(TABLE)
])

In [130]:

@app.callback(
    dash.dependencies.Output(
        'irt-param-scatter', 'figure'
    ),
    [
        dash.dependencies.Input(
            'season-dropdown','value'
        ),
        dash.dependencies.Input(
            'player-committing-dropdown','value'
        ),
        dash.dependencies.Input(
            'player-disadvantaged-dropdown','value'
        )
    ]
)
def update_irt_scatter(season_,
                       player_committing_,
                       player_disadvantaged_):
    irt_df = get_irt_df(
        player_committing_=player_committing_,
        player_disadvantaged_=player_disadvantaged_,
        season_=season_,
    )

    return get_irt_scatter(irt_df)

Understanding NBA Foul Calls with Python¶

PyData NYC • Nov 27, 2017 • @AustinRochford¶

About Me¶

Principal Data Scientist @ Monetate Labs • PyMC3 contributor¶

@AustinRochford • austinrochford.com • github.com/AustinRochford¶

austin.rochford@gmail.com • arochford@monetate.com¶

About this Talk¶

Last Two Minute Report¶

Scraping the data¶

Loading the data¶

Research questions¶

Exploratory Data Analysis¶

Data transformation¶

Modeling¶

Build a model of the science¶

Infer the model given data¶

Convergence diagnostics¶

Criticize the model given data¶

Intentional fouls¶

Build a model of the science, take two¶

The shot clock¶

Difference from call type average¶

Infer the model given data, take two¶

Criticize the model given data, take two¶

Model selection¶

Research questions¶

Build a model of the science, take three¶

Item-response theory¶

Infer the model given data, take three¶

Criticize the model given data, take three¶

Model selection¶

Is committing and/or drawing fouls a measurable player skill?¶

Year-over-year consistency (players that appeared in all seasons)¶

Year-over-year consistency (players that appeared at least ten times in all seasons)¶

Research questions¶

Visualize the Predictions¶

Thank You!¶

@AustinRochford • austinrochford.com • github.com/AustinRochford¶

austin.rochford@gmail.com • arochford@monetate.com¶

PyData NYC • Nov 27, 2017 • @AustinRochford ¶

@AustinRochford • austinrochford.com • github.com/AustinRochford ¶

austin.rochford@gmail.com • arochford@monetate.com ¶

@AustinRochford • austinrochford.com • github.com/AustinRochford ¶

austin.rochford@gmail.com • arochford@monetate.com ¶