A Quantitative Analysis of Deuces in Tennis

To celebrate fourteen years(!) of sporadically writing here and to break a bit of a recent dry spell, I thought it would be fun to revisit this blog's first post on probability theory, deuces, and tennis. Specifically, this post investigates whether historical match data aligns with the theoretical results derived in that post.

That post summed an infinite series to show that if the probability a server wins a single point is $p$, the probability that they win a game from deuce is

$$P(\text{Server wins from deuce}\ |\ p) = \frac{p^2}{1 - 2p \cdot (1 - p)}.$$
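This closed form also follows without summing the series: from deuce, the server either wins the next two points (probability $p^2$), loses the next two (probability $(1 - p)^2$), or splits them and returns to deuce (probability $2p \cdot (1 - p)$). Writing $D$ for the probability above, first-step analysis gives

$$D = p^2 + 2p \cdot (1 - p) \cdot D \implies D = \frac{p^2}{1 - 2p \cdot (1 - p)}.$$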

The plot below contrasts this function with the line $y = x$ to show how the deuce structure of tennis favors better players more than one might expect.

In [1]:
%matplotlib inline

In [2]:
from fastprogress.fastprogress import progress_bar
from itertools import product
from urllib.parse import urljoin
from warnings import filterwarnings

In [3]:
from matplotlib import ticker, pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

In [4]:
filterwarnings("ignore", category=pd.errors.DtypeWarning)
filterwarnings("ignore", category=pd.errors.SettingWithCopyWarning)

In [5]:
sns.set(color_codes=True)

pct_formatter = ticker.StrMethodFormatter("{x:.1%}")

In [6]:
def theory_win_prob(p):
    return p**2 / (1 - 2 * p * (1 - p))

In [7]:
fig, ax = plt.subplots(figsize=(6, 6))
ax.set_aspect("equal");

p_grid = np.linspace(0, 1, 50)

ax.plot(p_grid, theory_win_prob(p_grid), c="k", label="Theory");
ax.axline((0, 0), slope=1, c="k", ls="--", label="Equality");

ax.set_xlim(0, 1);
ax.xaxis.set_major_formatter(pct_formatter);
ax.xaxis.set_major_locator(ticker.MultipleLocator(0.25));
ax.set_xlabel("Probability server wins a point");

ax.set_ylim(0, 1);
ax.yaxis.set_major_formatter(pct_formatter);
ax.yaxis.set_major_locator(ticker.MultipleLocator(0.25));
ax.set_ylabel("Probability server wins\ngame from deuce");

ax.legend();


As a former calculus teacher, this derivation warms my heart; as a statistician, it piques my curiosity.
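As a quick sanity check before turning to real data, we can also verify the closed form by simulation. The sketch below (not part of the original derivation) plays deuces directly: from deuce, the first player to lead by two points wins the game.

```python
import random

def theory_win_prob(p):
    # closed form from the earlier post
    return p**2 / (1 - 2 * p * (1 - p))

def simulate_deuce(p, rng):
    # play from deuce: first player to lead by two points wins the game
    lead = 0
    while abs(lead) < 2:
        lead += 1 if rng.random() < p else -1
    return lead == 2

rng = random.Random(42)
p = 0.6
n = 200_000
empirical = sum(simulate_deuce(p, rng) for _ in range(n)) / n
# empirical should land close to theory_win_prob(0.6) ≈ 0.692
```

With 200,000 simulated deuces, the empirical win rate agrees with the formula to within a few tenths of a percentage point.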

Load the data

To see how well this theory holds up in reality, we load point-by-point data from Grand Slam singles matches since 2011. Many thanks to Jeff Sackmann of Tennis Abstract for making this data available under a CC BY-NC-SA 4.0 license on GitHub.

In [8]:
YEARS = range(2011, 2024)
TOURNAMENTS = ["ausopen", "frenchopen", "usopen", "wimbledon"]

In [9]:
DATA_PATH_BASE = "https://raw.githubusercontent.com/JeffSackmann/tennis_slam_pointbypoint/master/"

def get_data_path(year, tournament):
    return urljoin(DATA_PATH_BASE, f"{year}-{tournament}-points.csv")

In [10]:
EXCLUDE = {
    # wimbledon was cancelled in 2020 due to covid;
    # see https://en.wikipedia.org/wiki/2020_Wimbledon_Championships
    (2020, "wimbledon"),

    # there is no australian or french open data for 2022 and 2023
    (2022, "ausopen"),
    (2022, "frenchopen"),
    (2023, "ausopen"),
    (2023, "frenchopen")
}

In [11]:
data_paths = [
    get_data_path(year, tournament)
    for year, tournament in product(YEARS, TOURNAMENTS)
    if (year, tournament) not in EXCLUDE
]

In [12]:
DTYPES = {
    "match_id": str,
    "SetNo": int,
    "GameNo": int,
    "P1Score": str,
    "P2Score": str,
    "PointServer": int,
    "PointWinner": int,
    "GameWinner": int
}

INDEX = [
    "match_id",
    "set_no",
    "game_no"
]

In [13]:
def to_snake_case(camel_case):
    chars = []

    if camel_case:
        fst, *rest = camel_case
        chars.append(fst.lower())

        for char in rest:
            if char.isupper():
                chars.append("_")

            chars.append(char.lower())

    return "".join(chars)
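A quick check of the helper on the column names we care about (the function is repeated here so the snippet runs standalone):

```python
def to_snake_case(camel_case):
    # lower-case the first character, then prefix each later
    # upper-case character with an underscore
    chars = []

    if camel_case:
        fst, *rest = camel_case
        chars.append(fst.lower())

        for char in rest:
            if char.isupper():
                chars.append("_")

            chars.append(char.lower())

    return "".join(chars)

print(to_snake_case("PointWinner"))  # point_winner
print(to_snake_case("P1Score"))      # p1_score
```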

In [14]:
point_dfs = [pd.read_csv(path, usecols=DTYPES.keys()) for path in progress_bar(data_paths)]

point_df = (
    pd.concat(point_dfs, ignore_index=True)
    .dropna()
    .astype(DTYPES)
    .rename(columns=to_snake_case)
    .set_index(INDEX)
    .query("point_winner != 0")
)

In [15]:
point_df

Out[15]:
game_winner point_winner point_server p1_score p2_score
match_id set_no game_no
2011-ausopen-1101 1 1 0 1 2 15 0
1 0 1 2 30 0
1 0 1 2 40 0
1 1 1 2 0 0
2 0 1 1 15 0
... ... ... ... ... ... ... ...
2023-wimbledon-2701 2 10 0 1 1 15 0
10 0 1 1 30 0
10 0 1 1 40 0
10 0 2 1 40 15
10 1 1 1 0 0

1449025 rows × 5 columns

There is information on more than 1.4 million points, which should give us plenty of data to compare to our theoretical calculations.

The original data set contains many more columns than we have loaded here, which are only those relevant for the task at hand. These columns are:

• match_id - a unique identifier for the match
• set_no - the number of the set within the match
• game_no - the number of the game within the match
• game_winner - if this point decided the current game, which player won the game, otherwise zero
• point_winner - the player that won the point
• p1_score - player one's score at the beginning of the point
• p2_score - player two's score at the beginning of the point

Analysis

Probability server wins the point

First, we use this data to calculate the probability a player wins a point that they serve.

In [16]:
point_df["server_won_point"] = point_df["point_server"] == point_df["point_winner"]

In [17]:
point_df["server_won_point"].value_counts(normalize=True)

Out[17]:
server_won_point
True     0.614043
False    0.385957
Name: proportion, dtype: float64

As expected, we see that serving is a significant advantage, with players winning more than 60% of points that they serve.
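Plugging this overall rate into the theoretical formula gives a rough sense of what to expect at deuce (a back-of-the-envelope check; it ignores that serve-win rates vary by matchup):

```python
p = 0.614  # overall probability the server wins a point they serve

# probability the server wins a game from deuce, per the closed form
deuce_win_prob = p**2 / (1 - 2 * p * (1 - p))
print(f"{deuce_win_prob:.1%}")  # roughly 71.7%
```

So a server who wins points at the average rate should win roughly seven of every ten deuce games they play.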

A quick pivot table gives the probability that each player wins a point that they serve for each match in the data.

In [18]:
server_point_prob = point_df.pivot_table(
    index="match_id",
    values="server_won_point",
    columns="point_server"
)

In [19]:
server_point_prob

Out[19]:
point_server 1 2
match_id
2011-ausopen-1101 0.714286 0.076923
2011-ausopen-1103 0.717391 0.597826
2011-ausopen-1105 0.716981 0.652174
2011-ausopen-1108 0.520000 0.746269
2011-ausopen-1112 0.553571 0.651376
... ... ...
2023-wimbledon-2503 0.653465 0.580247
2023-wimbledon-2504 0.514706 0.650794
2023-wimbledon-2601 0.450000 0.596774
2023-wimbledon-2602 0.702128 0.633333
2023-wimbledon-2701 0.553846 0.466667

7981 rows × 2 columns

Note that since we have grouped by match_id, these probabilities are conditioned on both the server and their opponent, which is appropriate: it implicitly accounts for the skill difference between the two players in question.
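To see how the pivot works on a toy example (hypothetical points, not from the tournament data), note that pivot_table's default mean aggregation turns the boolean server_won_point column into a per-match, per-server win rate:

```python
import pandas as pd

toy = pd.DataFrame({
    "match_id": ["m1"] * 4 + ["m2"] * 4,
    "point_server": [1, 1, 2, 2, 1, 1, 2, 2],
    "server_won_point": [True, True, True, False, False, True, True, True],
})

# default aggfunc="mean" averages the booleans into win rates
toy_probs = toy.pivot_table(
    index="match_id", values="server_won_point", columns="point_server"
)
print(toy_probs)
```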

Identifying games that reach deuce

With these probabilities in hand, we turn to identifying games with deuces and their winners. Recall that a deuce is reached when the score in a game is 40-40. We first identify deuce points and reduce to games that reach deuce.

In [20]:
point_df["is_deuce"] = (point_df["p1_score"] == "40") & (point_df["p2_score"] == "40")

In [21]:
deuce_df = (
    point_df.groupby(INDEX)
    [["game_winner", "point_server", "is_deuce"]]
    .max()
    .rename(columns={
        "game_winner": "winner",
        "point_server": "server",
        "is_deuce": "has_deuce"
    })
    .query("has_deuce")
)

In [22]:
deuce_df

Out[22]:
winner server has_deuce
match_id set_no game_no
2011-ausopen-1101 2 2 1 1 True
2011-ausopen-1103 1 8 2 2 True
2 3 1 2 True
3 9 2 2 True
10 1 1 True
... ... ... ... ... ...
2023-wimbledon-2602 3 6 1 2 True
8 2 2 True
2023-wimbledon-2701 1 3 1 2 True
4 1 1 True
2 2 2 1 True

55107 rows × 3 columns

We see that more than 55,000 games reach deuce, providing plenty of data to compare to the theoretical results.

Comparing theory and reality

The final step to (in)validating the theoretical result is to join the deuce data to the server point probabilities to produce the probability that the server of each deuce wins a point.

In [23]:
aligned_server_point_prob, _ = server_point_prob.align(deuce_df, axis=0)

deuce_df["server_point_prob"] = aligned_server_point_prob[1].where(
    deuce_df["server"] == 1,
    aligned_server_point_prob[2]
)

In [24]:
deuce_df["server_won"] = deuce_df["server"] == deuce_df["winner"]

In [25]:
deuce_df

Out[25]:
winner server has_deuce server_point_prob server_won
match_id set_no game_no
2011-ausopen-1101 2 2 1 1 True 0.714286 True
2011-ausopen-1103 1 8 2 2 True 0.597826 True
2 3 1 2 True 0.597826 False
3 9 2 2 True 0.597826 True
10 1 1 True 0.717391 True
... ... ... ... ... ... ... ...
2023-wimbledon-2602 3 6 1 2 True 0.633333 False
8 2 2 True 0.633333 True
2023-wimbledon-2701 1 3 1 2 True 0.466667 False
4 1 1 True 0.553846 True
2 2 2 1 True 0.553846 False

55107 rows × 5 columns

To compare this data to the theoretical result, we group the deuces into fifty bins based on the probability that the server wins a point, each with an equal number of samples, and calculate the empirical rate at which the server won the deuce game within each bin.
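pd.qcut does the equal-count binning; mapping each resulting interval to its midpoint gives a representative x-value per bin. A small illustration on made-up serve-win probabilities:

```python
import pandas as pd

s = pd.Series([0.50, 0.55, 0.58, 0.60, 0.62, 0.65, 0.70, 0.75])

# four bins, each containing the same number of observations
bins = pd.qcut(s, 4)
mids = bins.apply(lambda interval: interval.mid)
print(mids.unique())
```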

In [26]:
N_BIN = 50

In [27]:
deuce_df["binned_server_point_prob"] = (
    pd.qcut(deuce_df["server_point_prob"], N_BIN)
    .apply(lambda interval: interval.mid)
)

In [28]:
test_df = (
    deuce_df.groupby("binned_server_point_prob")
    ["server_won"]
    .mean()
    .reset_index()
)

In [29]:
fig, ax = plt.subplots(figsize=(6, 6))
ax.set_aspect("equal");

ax.plot(p_grid, theory_win_prob(p_grid),
        c="k", ls="--", label="Theory");

test_df.plot(
    kind="scatter", x="binned_server_point_prob", y="server_won",
    zorder=5, alpha=0.75, label="Tournament data",
    ax=ax
);

ax.set_xlim(0, 1);
ax.xaxis.set_major_formatter(pct_formatter);
ax.xaxis.set_major_locator(ticker.MultipleLocator(0.25));
ax.set_xlabel("Probability server wins a point");

ax.set_ylim(0, 1);
ax.yaxis.set_major_formatter(pct_formatter);
ax.yaxis.set_major_locator(ticker.MultipleLocator(0.25));
ax.set_ylabel("Probability server wins\ngame from deuce");

ax.legend();


Except for the very first bin, we see quite remarkable agreement between the empirical tournament data and the theoretical result from the prior post. I always enjoy when simple mathematical models map so cleanly to reality!

Note that this analysis could be made more rigorous by only using points that occur in the match prior to the deuce in question to calculate the probability that the server wins a point they serve. For the purposes of this post, I am satisfied using all points in a match to calculate this probability, even though it is leaking some information from the future into the analysis of each deuce.
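One way to implement that stricter version (a sketch on a hypothetical toy match, not run against the full data set here) is an expanding mean over each server's earlier points, with shift(1) excluding the current point:

```python
import pandas as pd

# hypothetical toy points from a single match
points = pd.DataFrame({
    "match_id": ["m1"] * 6,
    "point_server": [1, 1, 2, 1, 2, 2],
    "server_won_point": [1, 0, 1, 1, 1, 0],
})

# for each (match, server), the serve-win rate over strictly earlier points;
# the first serve of each group has no history, so it gets NaN
points["prior_serve_win_prob"] = (
    points.groupby(["match_id", "point_server"])["server_won_point"]
    .transform(lambda s: s.shift(1).expanding().mean())
)
print(points)
```

This leak-free probability could then be joined onto the deuce points in place of the whole-match rate used above.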

This post is available as a Jupyter notebook here.

In [30]:
%load_ext watermark
%watermark -n -u -v -iv

Last updated: Tue Jul 30 2024

Python implementation: CPython
Python version       : 3.10.10
IPython version      : 8.11.0

pandas    : 2.0.3
matplotlib: 3.7.1
numpy     : 1.24.2
seaborn   : 0.12.2