In the world of tabletop role-playing games like Dungeons & Dragons (D&D), "rolling with Advantage" is a popular mechanic: you roll a 20-sided die \*twice\* and take the maximum of the two rolls, giving you a better chance of a high number. But how much does this improve the average roll? In this article, we explore the mathematics behind this question, generalize it to arbitrary collections of dice, and prove a conjecture originally posed by YouTuber Stand-up Maths. Along the way, we'll see connections to the birthday paradox and to the classical sum-of-powers formulas known as Faulhaber's formula. This article is based on the video version below. ![](https://www.youtube.com/watch?v=0d5gCzxD0sg) # 1\. Rolling with Advantage and the Stand-up Maths Conjecture When you roll a 20-sided die (a d20) once, the average outcome is straightforward to calculate: $\text{Average roll} = \frac{1 + 2 + \cdots + 20}{20} = 10.5$ In D&D, however, rolling with Advantage means rolling two d20 dice and taking the maximum result. Intuitively, this should increase the average roll, but by how much exactly? It turns out that the average maximum of two rolls of a 20-sided die is exactly 13.825, about 3.3 points higher than the average of a single roll. (We'll see a few different ways to get this exact answer later on.) $\text{Average maximum of two 20-sided dice} = 13.825$ More generally, the problem can be framed as follows: given an $n$-sided die rolled $m$ times, what is the expected value of the maximum roll, as a function of $n$ (the number of sides) and $m$ (the number of rolls)? Popular YouTuber Stand-up Maths, through computational experiments and partial sums, conjectured a formula for this expected maximum. 
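As a sanity check on the numbers above, the exact expected maximum can be computed directly from the distribution of the maximum, $\mathbb{P}(\max \le k) = (k/n)^m$. Here is a minimal Python sketch (the function name `expected_max` is my own, not from the video):

```python
from fractions import Fraction

def expected_max(n: int, m: int) -> Fraction:
    """Exact expected maximum of m independent rolls of an n-sided die.

    Uses P(max <= k) = (k/n)^m, so P(max = k) = (k/n)^m - ((k-1)/n)^m.
    """
    return sum(k * (Fraction(k, n) ** m - Fraction(k - 1, n) ** m)
               for k in range(1, n + 1))

print(expected_max(20, 1))  # 21/2   = 10.5, a single d20
print(expected_max(20, 2))  # 553/40 = 13.825, rolling with Advantage
```

Using `Fraction` keeps the arithmetic exact, so 13.825 comes out as the rational number $553/40$ rather than a floating-point approximation.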
The conjecture states: $\mathbb{E}[\max(X_1, \ldots, X_m)] \approx \frac{m}{m+1} n + \frac{1}{2}$ where $X_1, \ldots, X_m$ are $m$ independent rolls of an $n$-sided die. For the classic D&D case where $m=2$ and $n=20$, this formula gives: $\frac{2}{3} \times 20 + \frac{1}{2} = 13.8333\ldots$ which is remarkably close to the true average of 13.825, an overestimate of less than 0.01.  ![[RollingWithAdvantageScreenshot_001.png]] # 2\. Graphs of Approximation Error Sizes This approximation seems to work, but how accurate is it across different values of $n$ and $m$? To answer this, we examine graphs of the error of the approximation compared to the true expected maximum. These graphs reveal that: * For $m=2$ (rolling two dice), the error decreases as $n$ increases, approaching zero for large $n$. For a 20-sided die, the error is about 0.01, confirming the high accuracy of the formula. * For larger $m$ values (e.g., $m=7$), the error grows larger, around 0.04 for $n=20$, indicating the approximation becomes less precise when rolling many dice. ![[RollingWithAdvantageScreenshot_003.png]] To improve the approximation, we add a third term: $\mathbb{E}[\max(X_1, \ldots, X_m)] \approx \frac{m}{m+1} n + \frac{1}{2} - \frac{m}{12n}$ Including this $-\frac{m}{12n}$ term dramatically reduces the error; this is shown by the solid 3-term approximation lines in the graph. Zoomed-in graphs show errors shrinking to the order of $10^{-5}$, making the approximation phenomenally accurate. There is also exactly zero error for the case of two dice ($m=2$) or three dice ($m=3$); the formula is exactly correct in those cases. Plugging in $m=2$ and $n=20$ gives the classic D&D answer of 13.825 we stated earlier. ![[RollingWithAdvantageScreenshot_005.png]] # 3\. 
Probability Ninja Proof and High-Level Intuition for Each Term Instead of relying solely on algebraic formulas and integrals, we can gain deep intuition and a neat proof of the conjecture by using what I call a "probability ninja proof." This approach uses symmetry and clever manipulation of random variables to uncover the result almost effortlessly. Each term ends up having a nice intuition. ## 3.1 The First Term: $\frac{m}{m+1} n$ via Symmetry This term comes from a symmetry between dice rolls. We will end up having $m+1$ symmetric pieces $S_1, S_2, \ldots, S_{m+1}$ which divide up an interval of length $1$, and then we will choose exactly $m$ of them. See Section 5.1 for details. ![[RollingWithAdvantageScreenshot_007.png]] ## 3.2 The Second Term: $+\frac{1}{2}$ The second term accounts for the fact that the dice roll is discrete, while we will be using a continuous uniform approximation that ignores the rounding to integers. To refine this, we consider the effect of rounding up (denoted in math with $\lceil x \rceil$), and this leads to an extra term of $+\frac{1}{2}$. See Section 5.2 for details. This rounding trick is also useful if you want to simulate dice rolls using an ordinary calculator (which has a button for generating samples of $U$). ![[RollingWithAdvantageScreenshot_009.png]] ## 3.3 The Third Term: $-\frac{m}{12n}$ The third term arises from the probability that two or more dice rolls tie for the maximum value. This subtlety is connected to the famous birthday paradox in probability, which calculates the likelihood of coincidences in a set of randomly chosen items.  # 4\. A Tangent: Expected Minimum of Dice Rolls and Faulhaber's Sum of Powers Formula Before the proof of the main results, it's worth pointing out a connection to the minimum of dice rolls and the sum of powers formula. 
We use the following symmetry property of $n$-sided dice rolls: $X \overset{d}{=} n + 1 - X$ where $\overset{d}{=}$ denotes equality in distribution (this is reflected in real dice by the fact that opposite sides of a die sum to $n+1$). Therefore, the expected minimum can be expressed as: $\mathbb{E}[\min(X_1, \ldots, X_m)] = (n+1) - \mathbb{E}[\max(X_1, \ldots, X_m)]$ Using our approximation for the maximum, we get that the minimum is approximately: $\mathbb{E}[\min(X_1, \ldots, X_m)] \approx \frac{1}{m+1} n + \frac{1}{2} + \frac{m}{12n}$ On the other hand, there is a simple proof using the so-called Darth Vader rule in probability, which lets us write the expected minimum as exactly equal to a normalized sum of powers (there is a [short separate video](https://www.youtube.com/watch?v=AE1exYhVtP4) that shows why this is true): $ \mathbb{E}[\min \{X_1, X_2, \ldots, X_m\}] = \frac{1^m + 2^m + \ldots + (n-1)^m + n^m}{n^m} $ Combining these, we have a proof of an approximation to the sum of powers. These are the first three terms of [Faulhaber's formula](https://en.wikipedia.org/wiki/Faulhaber%27s_formula), which involves the Bernoulli numbers. ![[RollingWithAdvantageScreenshot_011.png]] # 5\. Detailed Derivations of the Approximation Terms ## 5.1 One Term Approximation Let $X$ be a roll of an $n$-sided die, taking values $1, 2, \ldots, n$ with equal probability $1/n$. We approximate $X$ by a continuous uniform variable scaled by $n$: $X \approx n U, \quad U \sim \text{Uniform}(0,1)$ Because the maximum commutes with scaling by a positive constant, $\max(nU_1, \ldots, nU_m) = n \max(U_1, \ldots, U_m)$, and expectation is linear, we can pull out the factor of $n$: $\mathbb{E}[\max(X_1, \ldots, X_m)] \approx n \mathbb{E}[\max(U_1, \ldots, U_m)]$ where each $U_i$ is independent Uniform(0,1). 
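A quick Monte Carlo estimate of this uniform maximum (a sketch using only Python's standard library; the seed and sample count are arbitrary choices of mine):

```python
import random

def estimate_max_of_uniforms(m: int, trials: int = 200_000) -> float:
    """Monte Carlo estimate of E[max(U_1, ..., U_m)] with U_i ~ Uniform(0,1)."""
    rng = random.Random(0)  # fixed seed so the run is reproducible
    return sum(max(rng.random() for _ in range(m))
               for _ in range(trials)) / trials

for m in (1, 2, 7):
    print(m, estimate_max_of_uniforms(m))  # ≈ 1/2, 2/3, 7/8 respectively
```

The estimates hover near $\frac{m}{m+1}$, which is exactly the formula derived next.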
The expected maximum of $m$ Uniform(0,1) variables is a well-known formula (which we will also derive from scratch in Section 5.1.1): $\mathbb{E}[\max(U_1, \ldots, U_m)] = \frac{m}{m+1}$ Thus, the one-term approximation is: $\boxed{\mathbb{E}[\max(X_1, \ldots, X_m)] \approx \frac{m}{m+1} n}$ ### 5.1.1 Expected Maximum of Uniform Random Variables: A Symmetry Argument To understand why the expected maximum of $m$ Uniform(0,1) variables equals $\frac{m}{m+1}$, consider the following geometric picture. ![[RollingWithAdvantageScreenshot_013.png]] Plot $m$ points $U_1, U_2, \ldots, U_m$ independently and uniformly on the interval $[0,1]$. The points divide the interval into $m+1$ segments: * $S_1$: from 0 to the smallest point * $S_2$: between the smallest and second smallest point * ... * $S_{m+1}$: from the largest point to 1 The maximum of the points is the rightmost point, so it equals the sum of the lengths of the first $m$ segments: $\max(U_1, \ldots, U_m) = S_1 + S_2 + \cdots + S_m$ By symmetry, each segment $S_i$ has the same expected length. This symmetry can be seen by wrapping the interval $[0,1]$ into a circle, where any $S_i$ can be mapped to any other $S_j$ by a rotation.  ![[RollingWithAdvantageScreenshot_015.png]] Since the $m+1$ segments sum to 1, each has expected length $\frac{1}{m+1}$. Therefore: $\mathbb{E}[\max(U_1, \ldots, U_m)] = m \times \frac{1}{m+1} = \frac{m}{m+1}$ ### 5.1.2 Another tangent: The expected value of a Beta random variable As a slight tangent, the argument just given also proves the formula for the expected value of a Beta random variable with integer parameters. This is because such a Beta random variable is simply the $k$-th smallest of $m$ uniform $[0,1]$ random variables. 
$ \mathbb{E}[\text{Beta}(k, m - k + 1)] = \mathbb{E}[k\text{-th smallest of } \{U_1, U_2, \ldots, U_m\}] = \frac{k}{m+1}$ ![[RollingWithAdvantageScreenshot_017.png]] ## 5.2 Two Term Approximation and the Role of Rounding Our initial continuous approximation ignores the discrete nature of dice rolls. To account for this, note that a discrete die roll $X$ can be represented exactly as: $X = \lceil n U \rceil$ or equivalently, instead of rounding _up_ on the RHS, we can "round down" on the LHS to see that: $X - \Delta = n U, \quad \Delta \sim \text{Uniform}(0,1)$ where $\Delta$ is the fractional part subtracted from $X$ to recover the continuous uniform variable. When taking the maximum over all $m$ dice, the $\Delta_i$ variables introduce a correction. In the simplest case, where there is a unique maximum among the dice rolls, this correction subtracts the expected value of a single $\Delta$, which is $\frac{1}{2}$. Thus, including this correction term, the two-term approximation becomes: $\boxed{\mathbb{E}[\max(X_1, \ldots, X_m)] \approx \frac{m}{m+1} n + \frac{1}{2}}$ ![[RollingWithAdvantageScreenshot_019.png]] ## 5.3 Three Term Approximation and Handling Ties The third term arises from the possibility of ties for the maximum value, where multiple dice show the same maximum number. When this happens, the maximum of the $X_i - \Delta_i$ involves the minimum of several $\Delta_i$ variables, complicating the calculation. For example, suppose two dice both come up with the value $6$; then the maximum after subtracting the $\Delta_i$ is: $\max(6 - \Delta_1, 6 - \Delta_2) = 6 - \min(\Delta_1, \Delta_2)$ The expected value of the minimum of two independent Uniform(0,1) variables is $\frac{1}{3}$, which differs from the $\frac{1}{2}$ expected for a single $\Delta$. 
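The $\frac{1}{3}$ in this tie example is easy to confirm by simulation (a rough sketch; the seed and sample size are arbitrary choices of mine):

```python
import random

rng = random.Random(42)
trials = 200_000

# Two dice tied at 6: max(6 - D1, 6 - D2) = 6 - min(D1, D2),
# so the average should be close to 6 - 1/3, not 6 - 1/2.
total = sum(max(6 - rng.random(), 6 - rng.random()) for _ in range(trials))
print(total / trials)  # ≈ 5.667
```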
Generalizing, if $N_{\text{maxers}}$ is the number of dice tied for the maximum, then: $\mathbb{E}[\max(X_1 - \Delta_1, \ldots, X_m - \Delta_m)] = \mathbb{E}[\max(X_1, \ldots, X_m)] - \mathbb{E}[\min(\Delta_{i_1}, \ldots, \Delta_{i_{N_{\text{maxers}}}})]$ where $i_1, \ldots, i_{N_{\text{maxers}}}$ index the dice that tie for the maximum. Fortunately, we know the formula for the minimum of uniforms: conditional on $N_{\text{maxers}}$, the expected minimum of the tied $\Delta$'s is just the reciprocal of $N_{\text{maxers}}+1$. This gives us a nice formula: $\mathbb{E}[\max(X_1 - \Delta_1, \ldots, X_m - \Delta_m)] = \mathbb{E}[\max(X_1, \ldots, X_m)] - \mathbb{E}\left[\frac{1}{N_{\text{maxers}}+1}\right]$ Accounting for ties therefore reduces to finding the probability that $N_{\text{maxers}}$ is 2 or higher, which we calculate exactly below; this is where the third term correction comes from. # 6\. Probability of Two Dice Tied for Maximum via Birthday Paradox The probability of two dice tying for the maximum is closely related to the birthday paradox, a classical problem in probability theory that calculates the likelihood of shared birthdays among a group of people. In the dice context, the problem is to find the probability that among $m$ dice rolls, exactly two dice tie for the maximum value. We start by looking for any coincidences among the dice, and then refine this to find only coincidences among the maximizers. 
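Since $X_i - \Delta_i = n U_i$, the left-hand side of the formula above equals $n \frac{m}{m+1}$ exactly, so the formula can be verified by brute-force enumeration of all $n^m$ equally likely outcomes. A small Python sketch (the function name is my own):

```python
from fractions import Fraction
from itertools import product

def max_minus_tie_term(n: int, m: int) -> Fraction:
    """E[max(X_i)] - E[1/(N_maxers + 1)], computed exactly by enumerating
    all n^m equally likely outcomes of m rolls of an n-sided die.
    By the tie-correction formula, this should equal n * m/(m+1)."""
    e_max = Fraction(0)
    e_tie = Fraction(0)
    for rolls in product(range(1, n + 1), repeat=m):
        top = max(rolls)
        e_max += top
        e_tie += Fraction(1, rolls.count(top) + 1)
    return (e_max - e_tie) / n ** m

print(max_minus_tie_term(6, 2), Fraction(6 * 2, 2 + 1))  # 4 4
print(max_minus_tie_term(6, 3), Fraction(6 * 3, 3 + 1))  # 9/2 9/2
```

Both sides agree exactly, confirming that all the error in the continuous approximation is captured by the tie-correction term.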
First, note that the probability that all dice have different values is: $\mathbb{P}(\text{all different}) = \prod_{k=0}^{m-1} \left(1 - \frac{k}{n}\right)$ Expanding out this product for large $n$ gives: $\mathbb{P}(\text{all different}) \approx 1 - \frac{m(m-1)}{2n} + O\left(\frac{1}{n^2}\right)$ The probability of two or more coincidences is $O(1/n^2)$, so the probability of exactly one coincidence (two dice equal) is approximately: $\mathbb{P}(\text{exactly one pair}) \approx \frac{m(m-1)}{2n} + O\left(\frac{1}{n^2}\right)$ When exactly one pair of dice share a value, there are $m-1$ distinct values among the dice, and by symmetry each is equally likely to be the largest, so the chance that the tied pair is the maximum is $\frac{1}{m-1}$. Therefore, the probability that exactly two dice tie for the maximum value is roughly: $\mathbb{P}(N_{\text{maxers}} = 2) \approx \frac{m(m-1)}{2n} \times \frac{1}{m-1} = \frac{m}{2n}$ Plugging this in gives: $\mathbb{E}\left[\frac{1}{N_{\text{maxers}}+1}\right] \approx \frac{1}{2}\left(1 - \frac{m}{2n}\right) + \frac{1}{3} \cdot \frac{m}{2n} = \frac{1}{2} - \frac{m}{12n}$ This calculation justifies the third term correction in the expected maximum formula. ![[RollingWithAdvantageScreenshot_021.png]] # 7\. Bonus Fourth Term and Error Analysis As a bonus, the fourth term in the series expansion of the expected maximum in powers of $1/n$ turns out to be zero. This means the error after the third term is of order $1/n^3$, making the three-term approximation extremely precise even for moderately sized dice. This is somewhat surprising and lacks an intuitive explanation, but detailed algebraic calculations confirm that the coefficient of the $1/n^2$ term vanishes. Thus, the refined formula can be written as: $\boxed{\mathbb{E}[\max(X_1, \ldots, X_m)] \approx \frac{m}{m+1} n + \frac{1}{2} - \frac{m}{12 n} + \frac{0}{n^2} + O\left(\frac{1}{n^3}\right)}$ # FAQ ### Q: What does rolling with Advantage mean in Dungeons & Dragons? A: Rolling with Advantage means rolling a die twice and taking the higher result, effectively giving you two chances to get a better roll. ### Q: How much does rolling with Advantage increase the average roll? 
A: For a 20-sided die, rolling twice and taking the maximum raises the average from 10.5 to approximately 13.825, an increase of about 3.3. ### Q: What is the general formula for the expected maximum of rolling an n-sided die m times? A: The expected maximum roll is approximately: $\frac{m}{m+1} n + \frac{1}{2} - \frac{m}{12 n}$ This formula is exactly correct for two or three dice ($m=2$ or $m=3$) and is very accurate for more dice as long as $m < n$. ### Q: Why does the formula include the terms 1/2 and -m/12n? A: The $\frac{1}{2}$ term accounts for the rounding effect from continuous to discrete dice rolls, while the $-\frac{m}{12n}$ term corrects for the probability of ties for the maximum roll, related to the birthday paradox. ### Q: How is the birthday paradox connected to this problem? A: The birthday paradox calculates the probability of coincidences (shared birthdays) in a group, which parallels calculating the probability of ties in dice rolls. ### Q: Can this method be used to calculate the expected minimum roll? A: Yes, by exploiting symmetry, the expected minimum can be derived from the expected maximum, leading to a similar formula. ### Q: How accurate is the three-term approximation? A: It is extremely accurate, with errors on the order of $1/n^3$, making it reliable for typical dice sizes. _This article was based on the video [The math of rolling with advantage: Stand-up Maths' max-of-dice conjecture proven!](https://www.youtube.com/watch?v=0d5gCzxD0sg) ._ # Support Me If you like this kind of stuff and you want to support me you can use the ko-fi widget below or buy me a coffee here <a href='https://ko-fi.com/V7V31AWXVZ' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi6.png?v=6' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>. Thank you! 
<iframe id='kofiframe' src='https://ko-fi.com/drmihainica/?hidefeed=true&widget=true&embed=true&preview=true' style='border:none;width:100%;padding:4px;background:#f9f9f9;' height='600' title='drmihainica'></iframe>