<p><em>James Watkins. Grad student. Analyst. Tinkerer.</em></p>
<h1>Baking Perfect Bacon</h1>
<p><em>2019-08-03</em></p>
<p>I keep forgetting at what temperature and for how long to cook bacon.</p>
<p>Let bacon thaw in advance. Frozen bacon is not fun to work with.<br />
Preheat oven on Convection Bake at 400F.<br />
Lay a sheet of parchment on a dark cookie sheet (dark bakes faster). Let it overlap the side of the sheet, so bacon grease is contained (and easy to clean up later).<br />
Lay bacon on pan in a single layer. Flip every second piece to fit more on, if necessary.<br />
Bake for 10 minutes.<br />
Check bacon. At 10 minutes it’s usually still pretty chewy. If you want it crispier, leave it in the oven and check it every 1-2 minutes.</p>
<p>12-13 minutes is usually an ideal half-chewy half-crispy state for BLTs.</p>
<h1>The Probability Distribution of Item Drops in Video Games, Part 2</h1>
<p><em>2019-07-30</em></p>
<p>In <a href="/probability/games/2019/07/22/drop-probability-in-games.html">Part 1</a>, we explored probability by calculating our chances of finding <a href="https://www.wowhead.com/item=13335/deathchargers-reins">Rivendare’s Deathcharger</a> in World of Warcraft’s Stratholme dungeon, and introduced the <a href="https://en.wikipedia.org/wiki/Binomial_distribution">binomial distribution</a>. Now, we’ll use simulations to test the ‘theoretical’ distribution we generated, and take a look at the expected number of failures before a successful mount drop using the geometric distribution.</p>
<p>To recap, the probability mass function (PMF) for a binomial distribution is given as</p>
<p><img src="/assets/equations/binomial-pmf.png" alt="Binomial Distribution Probability Mass Function" /></p>
<p>Since Rivendare’s Deathcharger (or, more specifically, the Deathcharger’s Reins item, which grants this mount) drops at a rate of 0.8% (or 1/125), we can find the probability of seeing the mount drop at least once over 125 dungeon runs by subtracting the PMF at k = 0 from 1. As it happens, we have a 63.4% chance of seeing the mount at least once over 125 dungeon runs. If we plot the probability for each potential value of k (0 through 125), we get a graph that looks like this:</p>
<p><img src="/assets/wow-graphs/binomial-pmf-125.png" alt="Binomial PMF Graph, n = 125" /></p>
<p>Note that this graph is truncated, as there’s little value in visualizing a distribution tail of over one hundred near-zero probabilities.</p>
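<p>The 63.4% figure is quick to verify in the Python shell. A minimal sketch using only the standard library (variable names are my own):</p>

```python
from math import comb

p = 1 / 125  # drop rate of Deathcharger's Reins
n = 125      # number of dungeon runs

# P(at least one drop) = 1 - P(k = 0) = 1 - C(n, 0) * p^0 * (1 - p)^n
p_none = comb(n, 0) * p**0 * (1 - p) ** n
print(f"{1 - p_none:.1%}")  # 63.4%
```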
<p>Since this graph is generated from an equation, not from actual dungeon runs, we might call it ‘theoretical’ rather than real. It relies on the law of large numbers: the more dungeon runs we complete, the more closely a distribution based on real runs should match our theoretical one. What if we use real data?</p>
<p>Now, I’m not going to load up WoW and run Stratholme over and over again. I’ve done that enough in the past (Cat Druid, for speed, stealth, and AoE spam when necessary, is a good way to go about it, but you’ll hit the hourly 10 instance cap in about 40 minutes). And since the characteristics of the dungeon are known, we <strong>know</strong> the mount drop should follow a binomial distribution. From past data collection efforts, we know with reasonable certainty the drop rate is 1/125. We know there are only two outcomes, success or failure. We know the probability for each run is constant and independent, because other drop mechanics were not introduced until later parts of the game. In other words, simulating real data in this case is pretty easy. If our probability were 0.5, we could simulate dungeon runs by flipping a coin over and over again. However, we’ll have a lot of trouble finding a (real) coin that only lands on heads once every 125 flips.</p>
<p>If we pop open the Python shell, the <code class="language-plaintext highlighter-rouge">random</code> module can give us such a coin:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">random</span>
<span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">125</span><span class="p">)</span>
</code></pre></div></div>
<p>This returns an integer between 1 and 125, inclusive. If we treat a value of 125 as a success, we can simulate dungeon runs by calling <code class="language-plaintext highlighter-rouge">random.randint(1, 125)</code> repeatedly. We could write a loop to repeat this line 125 times, or to keep going until we roll a 125. In Python 3.6 or later, we can use <code class="language-plaintext highlighter-rouge">choices()</code> to make this easier.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">random</span>
<span class="nb">sum</span><span class="p">(</span><span class="n">random</span><span class="p">.</span><span class="n">choices</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.992</span><span class="p">,</span> <span class="mf">0.008</span><span class="p">],</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">125</span><span class="p">))</span>
</code></pre></div></div>
<p>Here we define two choices, 0 for failure and 1 for success, and weight them appropriately (99.2% chance of failure, 0.8% chance of success). Our sample size is k, set to 125, meaning we generate 125 trials. This returns a list of 125 ones and zeros. I’m using <code class="language-plaintext highlighter-rouge">sum()</code> to add up the list and see how many successes we get in 125 runs (usually 0, 1, 2, or occasionally 3). So we have a customizable random number generator at our fingertips for running binomial simulations quickly, right from the shell. If you remember the random module’s functions and arguments, it’s as quick as opening Windows’ calculator app.</p>
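<p>Wrapping that one-liner in a loop lets us check the 63.4% figure empirically. A quick sketch (the 20,000-career count and the seed are arbitrary choices of mine):</p>

```python
import random

random.seed(42)     # arbitrary seed, for repeatability
careers = 20_000    # each "career" is 125 dungeon runs

# Count careers in which the mount dropped at least once
lucky = sum(
    sum(random.choices([0, 1], [0.992, 0.008], k=125)) >= 1
    for _ in range(careers)
)
print(lucky / careers)  # hovers near the theoretical 0.634
```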
<p>Of course, R has this functionality built in too. The <code class="language-plaintext highlighter-rouge">rbinom()</code> function allows us to iterate something like our Python simulation many times:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>This piece of code essentially runs 10 players through our dungeon 125 times each, with a success probability of 0.008. It’s the equivalent of running our Python code through a for loop over <code class="language-plaintext highlighter-rouge">range(10)</code>. The output is a list of ten numbers denoting each player’s number of successes:</p>
<blockquote>
<p>1 0 2 0 0 1 1 2 0 2</p>
</blockquote>
<p>So, four players didn’t see the mount drop, and six saw it drop at least once, which roughly corresponds to our 63.4% figure. As long as we’re not trying to run the simulation on a toaster, it’s easy to scale up, too:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rbinom</span><span class="p">(</span><span class="m">1000</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>Output:</p>
<blockquote>
<p>1 0 2 0 1 1 1 2 0 1 0 1 1 4 1 1 1 2 2 0 0 1 0 0 2 1 0 2 1 1 0 1 0 0 1 0 0 0 1 0 0 0 2 1 0 2 0 0 5 0 0 0 1 3 0 1 0 1 1 1 0 1 0 0 0 0 1 0 0 1 0 0 2 0 0 0 3 1 0 2 1 0 2 1 2 1 1 2 1 2 1 1 2 0 0 1 3 0 0 0 0 2 1 1 0 0 1 2 2 0 0 1 2 2 2 5 1 1 1 0 1 1 1 1 2 0 0 2 2 1 0 1 1 0 1 0 1 2 1 2 0 0 2 2 1 1 1 1 3 0 2 1 0 0 4 4 1 3 0 2 0 0 1 0 0 1 2 1 0 2 1 0 0 2 1 0 2 0 0 0 1 1 3 4 1 1 0 0 1 2 1 2 1 1 1 0 0 0 2 0 0 1 1 1 1 2 1 0 0 2 2 0 0 0 0 1 1 0 1 1 0 1 1 2 0 1 1 1 1 0 2 1 2 0 2 0 1 2 1 0 2 0 1 0 0 0 1 2 1 1 0 1 1 0 1 0 1 1 1 2 1 3 0 3 0 4 1 2 0 1 1 3 1 2 1 1 1 1 3 3 0 2 0 1 0 0 2 1 0 1 2 1 0 0 2 1 1 2 0 2 1 1 1 1 0 1 1 1 3 0 2 1 2 0 0 1 3 3 4 1 1 1 1 1 0 1 0 2 1 2 1 2 1 1 1 0 3 2 0 0 4 0 2 0 0 5 0 0 0 0 1 2 0 0 2 1 1 1 0 3 0 1 1 3 2 1 0 0 2 2 4 0 1 1 0 0 4 1 0 0 2 1 1 3 1 2 1 0 1 1 4 0 3 0 3 0 0 2 1 1 1 0 1 0 2 0 1 2 1 2 1 3 0 0 1 1 1 2 0 2 1 0 2 2 2 1 0 0 0 0 1 0 1 0 2 2 0 0 1 0 1 2 1 1 0 0 2 0 0 1 0 0 2 0 0 0 1 0 1 2 0 1 2 1 0 1 0 3 1 1 2 2 1 1 2 2 0 1 1 1 0 1 1 1 0 1 2 1 0 0 1 4 1 0 0 1 1 1 1 4 1 0 2 2 0 3 1 0 0 2 1 0 0 2 0 0 1 1 0 1 2 0 1 2 1 0 0 0 0 1 0 0 1 1 1 1 2 1 1 1 3 1 2 0 0 1 2 0 1 2 0 3 1 1 0 1 2 0 1 2 1 2 1 1 0 1 2 2 1 2 1 0 2 0 3 0 1 3 0 0 1 0 1 1 2 2 0 0 2 1 0 3 2 0 1 2 0 1 2 1 1 2 0 1 2 1 0 0 3 1 3 0 3 0 1 1 1 1 0 2 0 2 2 1 1 0 0 2 0 1 1 0 0 1 0 0 3 0 1 2 0 2 0 4 0 0 0 0 1 3 0 3 0 2 0 2 0 0 0 1 2 1 0 0 2 2 1 0 2 2 1 1 1 0 1 1 1 0 1 0 0 1 0 3 1 1 1 0 2 1 1 0 1 1 1 0 0 2 0 1 2 0 0 1 2 2 2 0 1 1 1 3 0 0 0 1 1 1 0 1 2 0 1 3 1 4 1 1 2 4 1 0 1 2 3 0 1 2 0 0 3 2 1 1 1 0 1 0 2 1 1 2 2 1 3 1 1 1 0 0 2 0 0 1 0 1 0 1 0 1 0 0 1 1 1 2 2 1 2 2 0 0 0 1 0 0 0 3 3 1 0 1 1 2 0 1 0 2 3 1 0 1 2 0 2 0 0 1 1 1 0 0 1 1 3 1 1 1 0 2 0 1 2 1 0 0 1 2 0 0 0 1 1 1 1 3 1 0 0 1 1 0 0 0 1 2 1 2 1 1 0 1 1 1 0 1 2 1 1 0 1 0 1 1 2 2 0 2 2 0 1 0 0 0 2 0 2 0 1 0 2 3 1 3 1 0 1 0 2 1 0 2 0 3 1 0 1 2 1 1 0 1 1 1 2 1 0 1 0 0 3 0 0 3 0 1 1 2 0 6 0 0 2 0 1 0 0 1 2 1 1 2 1 1 0 4 3 0 1 0 1 1 2 1 0 2 2 1 3 2 0 0 0 1 3 3 0 0 2 0 1 1 1 1 1 2 0 1 1 0 0 0 0 0 2 3 2 1 1 1 0 0 2 1 0 1 1 1 1 0 0 2 2 1 0 0 0 1 
1 0</p>
</blockquote>
<p>Yes, that’s 1,000 results of 125 trials each, or 125,000 total runs. I won’t post any more output vomit like that, but it does demonstrate how much we can do with one line of code. Imagine what the output would look like with 100,000 samples of 125 runs (12,500,000 total trials): it would be unintelligible, and also an ideal candidate for visualization! Let’s try it:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">successes</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">rbinom</span><span class="p">(</span><span class="m">100000</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)))</span><span class="w">
</span><span class="n">id</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">successes</span><span class="p">)))</span><span class="w">
</span><span class="n">df</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"id"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="s2">"successes"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">successes</span><span class="p">)</span><span class="w">
</span><span class="n">simHistogram</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">successes</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_histogram</span><span class="p">(</span><span class="n">binwidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.60</span><span class="p">,</span><span class="w">
</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"dark red"</span><span class="p">,</span><span class="w">
</span><span class="n">aes</span><span class="p">(</span><span class="n">fill</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">..count..</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_gradient</span><span class="p">(</span><span class="s2">"Frequency"</span><span class="p">,</span><span class="w"> </span><span class="n">low</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">high</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Histogram of Number of Mount Drops"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Based on 100,000 simulations of 125 dungeon runs"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Frequency"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span><span class="n">simHistogram</span><span class="w">
</span></code></pre></div></div>
<p>This generates the following graph:</p>
<p><img src="/assets/wow-graphs/hist125.png" alt="Simulated Mount Drops, n = 125" /></p>
<p>If we divide the y-axis by 100,000, our simulated distribution actually matches the theoretical distribution fairly well. If we want a summary of the actual numbers, we can generate a univariate frequency table.</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">histTable</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">table</span><span class="p">(</span><span class="n">successes</span><span class="p">)</span><span class="w">
</span><span class="n">histTable</span><span class="w">
</span></code></pre></div></div>
<p>And we get:</p>
<table>
<thead>
<tr>
<th>Number of mounts</th>
<th>Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>36,623</td>
</tr>
<tr>
<td>1</td>
<td>37,002</td>
</tr>
<tr>
<td>2</td>
<td>18,525</td>
</tr>
<tr>
<td>3</td>
<td>6,039</td>
</tr>
<tr>
<td>4</td>
<td>1,484</td>
</tr>
<tr>
<td>5</td>
<td>286</td>
</tr>
<tr>
<td>6</td>
<td>37</td>
</tr>
<tr>
<td>7</td>
<td>4</td>
</tr>
</tbody>
</table>
<p>In this simulation, 36.6% of players found the mount zero times over 125 runs, matching the 36.6% we would expect from the theoretical model. We could re-run this simulation many times with different seeds; although there will be small variations in the frequencies, we’ll end up with similar probabilities. The most noticeable difference between simulations will be how far the tail stretches along the x-axis.</p>
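<p>For readers without R at hand, the same experiment can be sketched in standard-library Python (scaled down to 20,000 players here to keep it fast; <code class="language-plaintext highlighter-rouge">simulate_players</code> is just an illustrative helper name of mine):</p>

```python
import random

def simulate_players(n_players, n_runs, p):
    """Number of mount drops each simulated player sees."""
    return [sum(random.random() < p for _ in range(n_runs))
            for _ in range(n_players)]

random.seed(0)  # arbitrary seed
drops = simulate_players(20_000, 125, 0.008)
# Share of players with zero drops; should land near 36.6%
print(f"{drops.count(0) / len(drops):.1%} of players saw zero drops")
```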
<p>What if, instead of simulating 100,000 players doing 125 runs each, we simulate 861 runs each? 861 dungeon runs is the point where the probability of seeing the mount drop at least once reaches 99.9%. Because the overwhelming majority of players should see the mount drop within 861 runs, the shape of our distribution will look different (though it still follows a binomial distribution). We can generate the theoretical distribution like so:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">dbinom</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">20</span><span class="p">,</span><span class="w"> </span><span class="m">861</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">20</span><span class="p">)</span><span class="w">
</span><span class="n">qplot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="s2">"#F9858F"</span><span class="p">),</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="n">shape</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">16</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_x_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">20</span><span class="p">,</span><span class="w"> </span><span class="m">5</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Binomial: Probability Mass Function"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"where p = 0.008 and n = 861"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Probability"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span></code></pre></div></div>
<p>This generates the following graph:</p>
<p><img src="/assets/wow-graphs/binomial-pmf-861.png" alt="Binomial PMF Graph, n = 861" /></p>
<p>Hypothetically, the majority of players should find the mount 5-8 times over 861 runs. A very small but non-zero number of players still won’t find the mount at all. Now, let’s simulate some dungeon runs. The code is nearly identical to our previous simulation; we just sub in a number in the <code class="language-plaintext highlighter-rouge">rbinom()</code> function and a label:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">successes</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">rbinom</span><span class="p">(</span><span class="m">100000</span><span class="p">,</span><span class="w"> </span><span class="m">861</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)))</span><span class="w">
</span><span class="n">id</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">successes</span><span class="p">)))</span><span class="w">
</span><span class="n">df</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"id"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="s2">"successes"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">successes</span><span class="p">)</span><span class="w">
</span><span class="n">simHistogram</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">successes</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_histogram</span><span class="p">(</span><span class="n">binwidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.60</span><span class="p">,</span><span class="w">
</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"dark red"</span><span class="p">,</span><span class="w">
</span><span class="n">aes</span><span class="p">(</span><span class="n">fill</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">..count..</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_gradient</span><span class="p">(</span><span class="s2">"Frequency"</span><span class="p">,</span><span class="w"> </span><span class="n">low</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">high</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Histogram of Number of Mount Drops"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Based on 100,000 simulations of 861 dungeon runs"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Frequency"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span><span class="n">simHistogram</span><span class="w">
</span></code></pre></div></div>
<p>And we get the following output:</p>
<p><img src="/assets/wow-graphs/hist861.png" alt="Simulated Mount Drops, n = 861" /></p>
<p>Again, dividing our frequencies by 100,000, we end up with proportions very close to the theoretical binomial distribution. We can examine the numbers with a table, as before (though I won’t print the whole thing out here).</p>
<table>
<thead>
<tr>
<th>Number of mounts</th>
<th>Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>92</td>
</tr>
<tr>
<td>1</td>
<td>715</td>
</tr>
<tr>
<td>2</td>
<td>2,372</td>
</tr>
<tr>
<td>3</td>
<td>5,436</td>
</tr>
<tr>
<td>4</td>
<td>9,668</td>
</tr>
<tr>
<td>5</td>
<td>13,016</td>
</tr>
<tr>
<td>6</td>
<td>15,183</td>
</tr>
<tr>
<td>7</td>
<td>14,856</td>
</tr>
<tr>
<td>8</td>
<td>12,987</td>
</tr>
<tr>
<td>9</td>
<td>9,908</td>
</tr>
</tbody>
</table>
<p>The 0.092% of players who didn’t find the mount is pretty close to the 0.099% we would expect. With larger simulations, these should match even more closely.</p>
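<p>That expected 0.099% is just the binomial PMF at k = 0, which is easy to confirm by hand:</p>

```python
# P(zero drops in 861 runs), i.e. (1 - p)^n with p = 0.008
p_zero = (1 - 0.008) ** 861
print(f"{p_zero:.3%}")  # 0.099%
```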
<p>One last demonstration: if we have data that follows a binomial distribution and want information on the number of failures before the first success (rather than the number of successes over a set number of trials), we’ll instead be graphing a geometric distribution. R has built-in functions for the geometric distribution as well. For our purposes, <code class="language-plaintext highlighter-rouge">dgeom()</code> gives the probability of a given number of failures before a success (i.e. a mount drop). For instance:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dgeom</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">dgeom</span><span class="p">(</span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">dgeom</span><span class="p">(</span><span class="m">300</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>These output 0.007936, 0.002931224, and 0.0007187728 respectively. Or, 0.79% of players should find the mount on their second try, 0.29% of players on their 126th try, and 0.072% on their 301st try. If it’s not already obvious, the probability we find the mount on our first try, or <code class="language-plaintext highlighter-rouge">dgeom(0, 0.008)</code>, is 0.008, our regular chance of success on any one dungeon run! We can plot a ‘theoretical’ geometric distribution with the following code (note that I’ve limited the x axis to 1000; since a success, or in this case a mount drop, cannot be guaranteed, the x axis actually stretches to infinity).</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">geomSingleProb</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">dgeom</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">1000</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)))</span><span class="w">
</span><span class="n">dfGeomDist</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"geomSingleProb"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">geomSingleProb</span><span class="p">,</span><span class="w"> </span><span class="s2">"x"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="o">:</span><span class="m">1000</span><span class="p">)</span><span class="w">
</span><span class="n">simGeomHistogram</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">dfGeomDist</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">geomSingleProb</span><span class="p">,</span><span class="w"> </span><span class="n">fill</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">geomSingleProb</span><span class="p">,</span><span class="w"> </span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_col</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.60</span><span class="p">,</span><span class="w">
</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NA</span><span class="w">
</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_gradient</span><span class="p">(</span><span class="s2">"Probability"</span><span class="p">,</span><span class="w"> </span><span class="n">low</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">high</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Geometric Distribution"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of Trials"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Probability"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span><span class="n">simGeomHistogram</span><span class="w">
</span></code></pre></div></div>
<p>And we get this graph:</p>
<p><img src="/assets/wow-graphs/geom-dist.png" alt="Geometric Distribution" /></p>
<p>As with the binomial examples, I’ve run a simulation to see how well real (well, simulated, but as good as real) data stacks up against the theoretical distribution. I’ve thought of two approaches here: specifying a number of players who run the dungeon until success, or specifying a blanket number of dungeon runs and seeing how many players succeed in finding the mount. If we want to specify the number of players, we can use <code class="language-plaintext highlighter-rouge">rgeom()</code>:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">g</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rgeom</span><span class="p">(</span><span class="m">100000</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">mean</span><span class="p">(</span><span class="n">g</span><span class="p">)</span><span class="w">
</span><span class="n">median</span><span class="p">(</span><span class="n">g</span><span class="p">)</span><span class="w">
</span><span class="n">table</span><span class="p">(</span><span class="n">g</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>First, I assign the output of <code class="language-plaintext highlighter-rouge">rgeom()</code> to <code class="language-plaintext highlighter-rouge">g</code> and take a look at some of the data it generates. Note that <code class="language-plaintext highlighter-rouge">rgeom()</code> counts the number of failures before the first success, so the mean is about 124 (the theoretical value is (1 - p)/p = 0.992/0.008 = 124, one less than the 125 runs you might expect) and the median about 86, which is what we’d expect given the success probability of 1/125 and the shape of a geometric distribution. <code class="language-plaintext highlighter-rouge">table(g)</code> outputs a frequency table, which shows that players who need more runs to succeed are progressively rarer than those who succeed early. Of course, we can plot this:</p>
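<p>We can sanity-check those numbers against the theoretical values without any simulation at all. Since <code class="language-plaintext highlighter-rouge">rgeom()</code> counts failures before the first success, the mean should be (1 - p)/p rather than 1/p, and the median comes from the quantile function (this snippet is my own quick check, not part of the original analysis):</p>

```r
p <- 0.008        # Deathcharger's Reins drop rate
(1 - p) / p       # theoretical mean failures before first success
#-> 124
qgeom(0.5, p)     # theoretical median
#-> 86
```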
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gNum</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">g</span><span class="p">))</span><span class="w">
</span><span class="n">df</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"gNum"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gNum</span><span class="p">)</span><span class="w">
</span><span class="n">simhist</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gNum</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_histogram</span><span class="p">(</span><span class="n">binwidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.60</span><span class="p">,</span><span class="w">
</span><span class="c1">#color = "dark red",</span><span class="w">
</span><span class="n">aes</span><span class="p">(</span><span class="n">fill</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">..count..</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_gradient</span><span class="p">(</span><span class="s2">"Frequency"</span><span class="p">,</span><span class="w"> </span><span class="n">low</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">high</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Histogram of Number of Trials to First Success"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Based on 100,000 simulations"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Trials Required"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Frequency"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span><span class="n">simhist</span><span class="w">
</span></code></pre></div></div>
<p>This outputs the following graph:</p>
<p><img src="/assets/wow-graphs/rgeom-plot.png" alt="Geometric Distribution Simulation, player-based" /></p>
<p>We can also find out how many dungeon runs take place in this simulation (rather than number of players we specified) by summing our data, i.e. <code class="language-plaintext highlighter-rouge">sum(g)</code>. In this case, it comes out to roughly 12.4 million failed runs, which should be no surprise given we had 100,000 players and a mean around 124 (100,000 * 124 = 12,400,000). Since <code class="language-plaintext highlighter-rouge">rgeom()</code> counts only failures, adding one successful run per player brings the total to about 12.5 million dungeon runs.</p>
<p>Now, the other way around. Let’s say we start with a set number of total dungeon runs:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0.008</span><span class="w">
</span><span class="n">trials</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">runif</span><span class="p">(</span><span class="m">1e6</span><span class="p">)</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="n">p</span><span class="w">
</span><span class="n">sim</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">diff</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">trials</span><span class="p">)))</span><span class="w">
</span><span class="n">geomMean</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">sim</span><span class="p">)</span><span class="w">
</span><span class="n">geomMedian</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">median</span><span class="p">(</span><span class="n">sim</span><span class="p">)</span><span class="w">
</span><span class="n">table</span><span class="p">(</span><span class="n">sim</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>This specifies our probability and how many trials we want to use (1 million), resulting in a dataset similar to the one <code class="language-plaintext highlighter-rouge">rgeom()</code> produced, without specifying how many players are involved: <code class="language-plaintext highlighter-rouge">which(trials)</code> gives the positions of the successes, and <code class="language-plaintext highlighter-rouge">diff()</code> gives the gap, in runs, between consecutive successes. I also grab the mean and median for later; in this simulation they were about 125 and 87 respectively (each one higher than in the <code class="language-plaintext highlighter-rouge">rgeom()</code> simulation, since this approach counts the successful run itself). Again, I looked at the frequencies with <code class="language-plaintext highlighter-rouge">table()</code>. Finally, I plot it, with the addition of dashed lines representing the mean and median:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">simnum</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">sim</span><span class="p">))</span><span class="w">
</span><span class="n">df2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"simnum"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">simnum</span><span class="p">)</span><span class="w">
</span><span class="n">simhist</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">df2</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">simnum</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_histogram</span><span class="p">(</span><span class="n">binwidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.60</span><span class="p">,</span><span class="w">
</span><span class="c1">#color = "dark red",</span><span class="w">
</span><span class="n">aes</span><span class="p">(</span><span class="n">fill</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">..count..</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_gradient</span><span class="p">(</span><span class="s2">"Frequency"</span><span class="p">,</span><span class="w"> </span><span class="n">low</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"blue"</span><span class="p">,</span><span class="w"> </span><span class="n">high</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_vline</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">xintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">geomMean</span><span class="p">),</span><span class="w">
</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"deepskyblue"</span><span class="p">,</span><span class="w">
</span><span class="n">linetype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"dashed"</span><span class="p">,</span><span class="w">
</span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_text</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">geomMean</span><span class="p">,</span><span class="w"> </span><span class="m">72</span><span class="p">,</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">round</span><span class="p">(</span><span class="n">geomMean</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">),</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">-0.25</span><span class="p">),</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"deepskyblue"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_text</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">geomMean</span><span class="p">,</span><span class="w"> </span><span class="m">75</span><span class="p">,</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Mean"</span><span class="p">,</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">-0.30</span><span class="p">),</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"deepskyblue"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_vline</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">xintercept</span><span class="o">=</span><span class="n">geomMedian</span><span class="p">),</span><span class="w">
</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"darkolivegreen2"</span><span class="p">,</span><span class="w">
</span><span class="n">linetype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"dashed"</span><span class="p">,</span><span class="w">
</span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_text</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">geomMedian</span><span class="p">,</span><span class="w"> </span><span class="m">72</span><span class="p">,</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">round</span><span class="p">(</span><span class="n">geomMedian</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">),</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.75</span><span class="p">),</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"darkolivegreen2"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_text</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">geomMedian</span><span class="p">,</span><span class="w"> </span><span class="m">75</span><span class="p">,</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Median"</span><span class="p">,</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.2</span><span class="p">),</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"darkolivegreen2"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Histogram of Number of Trials to First Success"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Based on 1,000,000 simulations"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Trials Required"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Frequency"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span><span class="n">simhist</span><span class="w">
</span></code></pre></div></div>
<p>This outputs the following graph:</p>
<p><img src="/assets/wow-graphs/hist-num-trials.png" alt="Geometric Distribution Simulation, trial-based" /></p>
<p>If we sum the data, i.e. <code class="language-plaintext highlighter-rouge">sum(trials)</code>, we’ll find that this only represents 7,978 players. As such, the graph is a little more erratic! With a larger sample, it would smooth out. We’d need to specify about 12.5 million dungeon runs for the size of this simulation to match the previous, player-based one. That said, both simulations appear to be fairly good approximations of the geometric distribution.</p>
<p>A note on reproducibility: since the data used in simulations are generated randomly, every time we run the simulation we will get slightly different results (or vastly different, if we run a small simulation). Random values in R are not truly random; R uses a pseudorandom number generator (the Mersenne Twister, by default), which begins from a seed value. If we want to be able to reproduce a simulation exactly, we can specify a seed value at the start of our simulation using <code class="language-plaintext highlighter-rouge">set.seed()</code>. For example, if I run <code class="language-plaintext highlighter-rouge">rbinom()</code> five times, I get five different results:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 0 0 4 4 0 1 1 0 2 1</span><span class="w">
</span><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 0 0 1 0 2 0 0 2 0 1</span><span class="w">
</span><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 0 1 1 2 1 1 2 2 2 1</span><span class="w">
</span><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 0 0 1 1 1 1 0 0 1 3</span><span class="w">
</span><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 2 1 1 0 1 0 0 1 0 2</span><span class="w">
</span></code></pre></div></div>
<p>If I run <code class="language-plaintext highlighter-rouge">rbinom()</code> with a seed value, I will get the same result every time. I could email the R code to someone else, and they’ll get the exact same result on their computer, too:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">set.seed</span><span class="p">(</span><span class="m">1867</span><span class="p">)</span><span class="w">
</span><span class="n">rbinom</span><span class="p">(</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1">#-> 0 1 0 1 1 1 0 0 1 0</span><span class="w">
</span></code></pre></div></div>
<p>I did not specify a seed when I ran the above simulations. Since the sample size is large and the data follow known distributions, this isn’t really a problem for reproducibility. Although it’s very unlikely someone else will reproduce these simulations exactly, any similarly large simulation will arrive at values that are roughly the same. If we wanted to, we could run and record the mean of many simulations (a plot of these means should be approximately normal), calculate their standard deviation, and construct a confidence interval, but I’ll leave it at this for now. We’ve already learned way more than anyone should ever need to know about the Deathcharger’s Reins drop rate.</p>In Part 1, we explored probability by calculating our chances of finding Rivendare’s Deathcharger in World of Warcraft’s Stratholme dungeon, and explored the binomial distribution. Now, we’ll use simulations to test the ‘theoretical’ distribution we generated, and take a look at the expected number of failures before a successful mount drop using the geometric distribution.The Probability Distribution of Item Drops in Video Games, Part 12019-07-22T05:52:30-07:002019-07-22T05:52:30-07:00/probability/games/2019/07/22/drop-probability-in-games<p>There is some confusion in online video game communities about how probability works, particularly in regard to item drop chances. This seems to happen wherever players have a very low chance of obtaining something they want, and are willing to play through content repeatedly to obtain it. I’ve noticed this in Path of Exile with the Labyrinth enchantments, where there are 362 possible helmet enchantments, each with an equal probability of occurring (or a 1/362 chance of any particular enchantment occurring). This also happens in World of Warcraft (WoW), where many of the highly prized mounts have a 0.5-4% chance of dropping in a dungeon or raid playthrough.</p>
<p>If we had a WoW mount with a 1% chance of dropping, it is tempting to say that we simply need to run the dungeon 100 times to have a 100% chance of obtaining the mount. Many people think of probability this way, and are disappointed when it takes them many more runs through the content to finally get what they’re looking for. If the chance of something happening is 1%, or 1/100, this doesn’t guarantee that it will happen to a particular player after 100 trials. Rather, it follows the law of large numbers. Player A might find the mount after a single run, Player B after 35 runs, Player C after 500 runs, but on average, over tens or hundreds of thousands of runs, the mount will be found about once in every hundred runs.</p>
<p>So, how do we find our actual likelihood of finding an item (or, outcome) over a number of runs (or, trials)? I will use <a href="https://www.wowhead.com/item=13335/deathchargers-reins">Deathcharger’s Reins</a> in WoW to demonstrate. Deathcharger’s Reins is an item that drops off Baron Rivendare, the final boss of the Stratholme dungeon, and awards the ‘Rivendare’s Deathcharger’ mount. Although now owned by about 20% of players, this mount was once highly sought-after as it was the only means Alliance players had to ride an undead horse, and there are plenty of mount-collectors still looking for it. Finding this mount can be frustrating because, according to Wowhead, it drops about 0.8% of the time (or, a probability of 0.008).</p>
<blockquote>
<p><img src="/assets/rivendares-deathcharger.jpg" alt="Rivendare's Deathcharger" title="Rivendare's Deathcharger" />
Rivendare’s Deathcharger in action, being ridden by an Alliance night elf player. Image courtesy of Wowhead.</p>
</blockquote>
<p>In order to find the probability, there are a couple of characteristics of our ‘mount trials’ worth considering. First, the probability must always be a number between 0 and 1. A negative probability would indicate a probability that is even lower than ‘impossible’, and a probability greater than one would indicate a probability higher than ‘absolute certainty’; frankly, these values wouldn’t make any sense, and if we end up with a probability outside the 0-1 range we’ve probably done something wrong. Second, there are only two possible outcomes for every trial (either we found the mount or we didn’t), and the probability of success doesn’t change across trials. Unlike, say, the Legion legendary items, there is no ‘bad luck protection’ at play which increases the probability of success after each failure. There are no other factors, such as dungeon completion time, which can influence the drop rate of Deathcharger’s Reins off of Baron Rivendare, either.</p>
<p>Now, it is tempting to go for the pitfall earlier where we sum the individual probability of each run. Our equation would look something like this:</p>
<p><img src="/assets/equations/additive-probability.png" alt="Additive Probability" /></p>
<p>Where P is our total probability and p is our probability of success on a particular run, summed over all n runs. Or, p(1) + p(2) + … + p(n).</p>
<p>With a drop rate of 0.8%, we would need 125 runs for our added probabilities to sum to 1, or 100%. However, what if we did 126 runs? Our probability would equal 1.008. On run 127 our probability would equal 1.016. And so on. We can’t have a probability above 1, meaning we can’t sum probabilities in this way.</p>
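<p>A two-line check in R (my own illustration, not code from the original post) makes the problem concrete:</p>

```r
p <- 0.008
125 * p   # summed 'probability' after 125 runs: 1
126 * p   # after 126 runs: 1.008, which is impossible for a probability
```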
<p>What if we try multiplying our probabilities? Probabilities are often multiplied when comparing events or trials. For instance, the probability of flipping a coin twice and landing on heads both times is the product of the probability of landing heads on each flip, or 0.5 * 0.5 = 0.25. Our equation would look something like this:</p>
<p><img src="/assets/equations/exponential-probability.png" alt="Exponential Probability" /></p>
<p>Our total probability is P, our probability of success for a run is p, and n is the number of runs.</p>
<p>If we try this over a small number of runs, we come up with another problem:</p>
<blockquote>
<p>0.008 * 0.008 = 0.000064<br />
0.008 * 0.008 * 0.008 = 0.000000512<br />
etc.</p>
</blockquote>
<p>Our probability gets smaller over a greater number of runs! If we followed this through to 125 runs like in the previous example, we would end up with an infinitesimally small probability of 7.696 * 10^-263, which may as well be 0. What we’ve actually calculated here is the probability of success on every run. So, we have a 0.8% chance of finding the mount once, a 0.0064% chance of finding the mount twice in a row, a 0.0000512% chance of finding the mount three times in a row, and so on. Unlike the previous example where we added probabilities together, this does give us useful information, but it’s not the information we’re looking for.</p>
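<p>The same figures fall out of R directly (again, a small sketch of my own rather than code from the post):</p>

```r
p <- 0.008
p^2     # probability of finding the mount twice in a row: 6.4e-05 (0.0064%)
p^3     # three times in a row: 5.12e-07 (0.0000512%)
p^125   # all 125 runs in a row: about 7.7e-263, effectively zero
```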
<p>However, we can apply this same method to our problem; we just need to invert our thinking. Instead of working with the probability of success, we can use the probability of failure. Since our failure probability is less than 1, multiplying the failure probability for each attempt results in an ever-decreasing number: the probability of failing every single run. And since probabilities range from 0 to 1, the probability of at least one success over a given number of runs is the complement of the probability of failing every run, i.e. one minus it. Our equation would look something like this:</p>
<p><img src="/assets/equations/probability-over-n.png" alt="Probability Over n Runs" /></p>
<p>Each run has a 0.008 probability of success (0.8%, or 1/125), so the probability of failure is 1 - 0.008 = 0.992 (or 99.2%, or 124/125) for each run. If we were to run the dungeon 125 times, what would our probability of success or failure be?</p>
<blockquote>
<p>1 - (probability of failure)^number of runs<br />
1 - (0.992)^125<br />
1 - 0.366<br />
0.634</p>
</blockquote>
<p>In other words, after 125 runs of the Stratholme dungeon, there is a 63.4% chance we will have seen Deathcharger’s Reins drop off Baron Rivendare, and a 36.6% chance we won’t see it drop. Or perhaps more accurately, 63.4% of players should see the mount drop by the time they have reached 125 runs. If we keep inputting different values for n (number of runs), it’s noticeable that the relationship between number of runs and probability of success is non-linear. It takes 21 runs to reach 15%, 45 runs to reach 30%, 87 runs to reach 50%, 173 runs to reach 75%, 287 runs to reach 90%, 373 runs to reach 95%, 574 runs to reach 99%, 861 runs to reach 99.9%, and so on. We’ll never reach a probability of 1, except through rounding, because we can’t guarantee we’ll ever have a success (even a 50/50 coin flip can have a long streak of only-heads or only-tails if we flip enough times).</p>
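<p>Those run counts are easy to reproduce by solving 1 - (1 - p)^n ≥ target for n. Here is a short helper of my own (the <code class="language-plaintext highlighter-rouge">runsNeeded</code> name is just for illustration):</p>

```r
p <- 0.008
# smallest n such that 1 - (1 - p)^n reaches the target probability
runsNeeded <- function(target) ceiling(log(1 - target) / log(1 - p))
sapply(c(0.50, 0.75, 0.90, 0.95, 0.99), runsNeeded)
#-> 87 173 287 373 574
```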
<p>In fact, what we’re actually describing is the <a href="https://en.wikipedia.org/wiki/Binomial_distribution">binomial distribution</a>. The probability mass function of the binomial distribution describes the probability of getting a particular number of successes, given a number of attempts with unchanging independent probability of success. The probability mass function for a binomial distribution is:</p>
<p><img src="/assets/equations/binomial-pmf.png" alt="Binomial Distribution Probability Mass Function" /></p>
<p>Where k is the number of successes, n is the number of runs, and p is the probability of success in any particular run. For anyone unfamiliar, the ‘!’ symbol indicates a factorial: if n = 7, then n! = 7 * 6 * 5 * 4 * 3 * 2 * 1.</p>
<p>The binomial distribution describes the probability of a given number of successes, rather than just one success. It’s certainly possible for us to find the Deathcharger’s Reins more than once over 125 runs. Personally, I’ve seen the mount drop twice within 100 runs, and have seen other rare mounts drop in quick succession. When we found the probability of success over 125 attempts (63.4%), we were really finding the cumulative probability of having <strong>at least</strong> one success via the binomial distribution.</p>
<p>We can use the R language to find and graph our probabilities without too much trouble. Even though the math is fairly simple, it’s much easier and quicker to have a computer do it than to calculate by hand.</p>
<p>The R function <code class="language-plaintext highlighter-rouge">dbinom()</code> gives the probability mass for a given value of k. R’s documentation calls this the ‘density’; strictly speaking, ‘density’ refers to continuous distributions, and for a discrete distribution like the binomial, ‘mass’ (the term I used earlier) is the precise word, but R simply uses one name for both. This function takes three arguments: k, n, and p. If we set k = 0, the function returns the probability we find the mount zero times. If we want the probability of finding the mount at least once, we take the complement:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="m">1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">dbinom</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
<p>This outputs ‘0.633597’, the same value (rounded to fewer digits) that we calculated by hand earlier. If we want to specify multiple values of k, we can do so, and R will give us each individual value. Here, we find the probability of finding exactly 1, 2, 3… up to 7 mounts in 125 runs.</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dbinom</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">7</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="c1"># Output:</span><span class="w">
</span><span class="c1"># [1] 0.3693578615 0.1846789308 0.0610631948 0.0150195762 0.0029312399</span><span class="w">
</span><span class="c1"># [6] 0.0004727806 0.0000648167</span><span class="w">
</span></code></pre></div></div>
<p>Note that the sum of these values is ‘0.6335883973’, very close to the ‘0.633597’ value returned by the previous chunk of code, and to the value we calculated by hand (the small difference is just the probability mass for k &gt; 7). We could specify greater values of k, but the probabilities become so small there’s no point in printing them all. Instead, we can have R sum them for us and not worry about it. Note: this returns the same result as <code class="language-plaintext highlighter-rouge">1 - dbinom(0, 125, 0.008)</code>:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">sum</span><span class="p">(</span><span class="n">dbinom</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">))</span><span class="w">
</span></code></pre></div></div>
<p>So again, our output is ‘0.633597’. A typical player running Stratholme 125 times has a 0.634 probability, or 63.4% chance, of having the Deathcharger’s Reins drop. This is fine and all, but visualizing our data helps us understand it much more quickly than a summed range, or heaven forbid a table of hundreds of probabilities for values of k. We can use ggplot2, a visualization package (and part of the Tidyverse group of packages), to help make things more understandable. The following graphs also use <a href="https://github.com/partywhale/r-scripts/blob/master/theme_partywhale.R">theme_partywhale</a>, a simple but clean modification of ggplot2’s theme_grey theme.</p>
<p>Setting y to our <code class="language-plaintext highlighter-rouge">dbinom()</code> probabilities, and x to the k values we want, the code for a quick binomial probability mass function is as follows:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">dbinom</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="p">)</span><span class="w">
</span><span class="n">qplot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="s2">"#F9858F"</span><span class="p">),</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="n">shape</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">16</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_x_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Binomial: Probability Mass Function"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"where p = 0.008 and n = 125"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Probability"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span></code></pre></div></div>
<p>Which gives us the following graph:</p>
<p><img src="/assets/wow-graphs/binomial-pmf-125.png" alt="Binomial PMF Graph, n = 125" /></p>
<p>So, according to our graph, we have just over a 0.35 probability of having no mount drop, a similar probability of finding the mount exactly once, just under 0.2 probability of finding it twice, and just over 0.05 probability of finding it three times. Other values of k are so small as to be negligible (which is why I haven’t calculated all 126 possible values of k). Even without seeing the exact values, this gives us a good sense of relative probabilities for k, and with a little mental arithmetic we know our cumulative probability for k > 0 will be just over 0.6.</p>
<p>We can also use <code class="language-plaintext highlighter-rouge">pbinom()</code>, which takes the same arguments as <code class="language-plaintext highlighter-rouge">dbinom()</code>, to plot the cumulative distribution function. This will give us the probability of having k or fewer successes. The equation for the cumulative distribution function is:</p>
<p><img src="/assets/equations/binomial-cdf.png" alt="Binomial Distribution Cumulative Distribution Function" /></p>
<p>And the code for our plot:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">pbinom</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">)</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="w">
</span><span class="n">qplot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="s2">"#F9858F"</span><span class="p">),</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="n">shape</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">16</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_x_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_y_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0.00</span><span class="p">,</span><span class="w"> </span><span class="m">1.00</span><span class="p">,</span><span class="w"> </span><span class="m">0.20</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">expand_limits</span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Binomial: Cumulative Distribution Function"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"where p = 0.008 and n = 125"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Probability"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span></code></pre></div></div>
<p>Which gives the following graph:</p>
<p><img src="/assets/wow-graphs/binomial-cdf-125.png" alt="Binomial CDF Graph, n = 125" /></p>
<p>So our probability of finding 0 mounts is just over 0.35, 0-1 mounts is about 0.75, 0-2 mounts is just over 0.9, and 0-3 mounts is nearly 1. Although further values for k look like they’re at 1 on the graph, they only approach 1 very closely. If our graph went to k = 125, our probability of finding between 0 and 125 mounts over 125 runs would equal exactly 1 (as we cannot find fewer than 0 or more than 125). However, applied in this way, a cumulative distribution function isn’t very useful because it’s awkward to interpret. As before, we can subtract our values from one to get the probability of finding a number of mounts greater than k:</p>
<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">pbinom</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">125</span><span class="p">,</span><span class="w"> </span><span class="m">0.008</span><span class="p">,</span><span class="w"> </span><span class="n">lower.tail</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">)</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="o">:</span><span class="m">10</span><span class="w">
</span><span class="n">qplot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="s2">"#F9858F"</span><span class="p">),</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="n">shape</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">(</span><span class="m">16</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_x_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Binomial: Inverse Cumulative Distribution Function"</span><span class="p">,</span><span class="w">
</span><span class="n">subtitle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"where p = 0.008 and n = 125"</span><span class="p">,</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Number of mount drops, greater than"</span><span class="p">,</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Probability"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme_partywhale</span><span class="p">()</span><span class="w">
</span></code></pre></div></div>
<p>Alternatively, we could use <code class="language-plaintext highlighter-rouge">1 - pbinom(0:10, 125, 0.008)</code>, but R has this built-in via the lower.tail parameter. We get the following graph:</p>
<p><img src="/assets/wow-graphs/binomial-icdf-125.png" alt="Binomial ICDF Graph, n = 125" /></p>
<p>Here, the probability that we will find greater than 0 mounts (or, 1+ mounts) is just over 0.6, or 0.633597 to be exact. The probability that we will find greater than 1 mount (or, 2+ mounts) looks to be around 0.27, and so on. This graph is still a little clunky, and we would want to fix the labels (particularly on the x-axis) to make interpreting it a little easier. But, ultimately, with a little polishing this is information we can show anyone and be (reasonably) understood without having to explain a binomial distribution.</p>
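<p>If you’d like to check these tail probabilities outside of R, here’s a small Python stand-in for <code class="language-plaintext highlighter-rouge">pbinom(k, n, p, lower.tail = FALSE)</code> (my own illustration of the same math, not code from the post):</p>

```python
from math import comb

def binom_sf(k, n, p):
    # Survival function P(X > k): sum the PMF over k+1 .. n,
    # mirroring R's pbinom(k, n, p, lower.tail = FALSE)
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k + 1, n + 1))

print(round(binom_sf(0, 125, 0.008), 6))  # P(1 or more drops) = 0.633597
print(round(binom_sf(1, 125, 0.008), 6))  # P(2 or more drops), roughly 0.264
```

<p>The second value is what the graph shows at k = 1: read off the plot it looks like roughly 0.27, and computed exactly it is about 0.264.</p>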
<p>Next we’ll use simulations to test our perfect, or ‘theoretical’, binomial distributions, and take a look at the expected number of failures before a successful mount drop using the geometric distribution. But for now, this post is long enough!</p>
<p><a href="/probability/games/2019/07/30/drop-probability-in-games-2.html">Onward to Part 2!</a></p>There is some confusion in online video game communities about how probability works, particularly in regard to item drop chances. This seems to happen wherever players have a very low chance of obtaining something they want, and are willing to play through content repeatedly to obtain it. I’ve noticed this in Path of Exile with the Labyrinth enchantments, where there are 362 possible helmet enchantments, each with an equal probability of occurring (or a 1/362 chance of any particular enchantment occurring). This also happens in World of Warcraft (WoW), where many of the highly prized mounts have a 0.5-4% chance of dropping in a dungeon or raid playthrough.Understanding the Rubik’s Cube2019-06-30T23:37:58-07:002019-06-30T23:37:58-07:00/puzzles/2019/06/30/understanding-the-rubiks-cube<p>You’ve probably played with a ‘15 puzzle’ or ‘8 puzzle’ sliding puzzle before, and likely even solved it. I’ve owned many cheap plastic 8 puzzles over the years, since they were a staple of birthday party goodie-bags. An 8-year-old can figure out the 8 puzzle within a couple of minutes. If you’re not familiar, sliding puzzles involve sliding numbered tiles around a board until the numbers are in order (top to bottom, left to right), with the blank space in the bottom-right corner.</p>
<blockquote>
<p><img src="/assets/slidingpuzzle.jpg" alt="15 Puzzle" title="15 Puzzle" />
The ‘15 Puzzle’, invented in 1874.</p>
</blockquote>
<p>Sliding puzzles are a form of 2-dimensional combination puzzle. They are 2-dimensional because we slide tiles along two axes (horizontal, or x; vertical, or y), and they are combination puzzles because the objective is to reorder the tiles into a particular combination from a random state. The rules of the game are simple. We move tiles along x and y, swapping pieces until the puzzle is solved. Each tile has one (and only one) position it must occupy to be considered solved. We can’t pop tiles out of the board and replace them somewhere else. Of course, the axes are static and don’t mysteriously rearrange themselves.</p>
<blockquote>
<p><img src="/assets/15puzzle-axes.png" alt="15 Puzzle" title="15 Puzzle" />
The two axes of the 15 puzzle, x and y.</p>
</blockquote>
<p>It’s impossible to move a piece around the board without affecting other pieces, since there’s only one open space to swap tiles into. As long as our randomized scramble is done from a solved state, the puzzle will always be solvable. If we scramble by popping out the pieces and reinserting them randomly, only half of all possible scrambles will be solvable; we can end up needing to swap just two tiles, and there’s no way to do that without disturbing other pieces.</p>
<p>The above photo of the 15 Puzzle is actually unsolvable - there’s no combination of moves that can solve it, unless we pop out some of the tiles. This only happens if the puzzle is assembled incorrectly by removing and reinserting tiles.</p>
<p>If we know and follow the rules of our sliding puzzle, any scramble that follows those rules is solvable without very much difficulty.</p>
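<p>That ‘half of all scrambles’ fact can be tested with the standard inversion-counting parity check. The following Python sketch is my own illustration (the post itself involves no code):</p>

```python
def solvable_15_puzzle(tiles):
    """Check whether a 4x4 sliding-puzzle state is reachable from solved.

    tiles: 16 ints, rows listed top to bottom, 0 standing in for the blank.
    For an even-width board, a state is solvable exactly when the inversion
    count plus the blank's row (counted from the bottom, 1-indexed) is odd.
    """
    flat = [t for t in tiles if t != 0]
    inversions = sum(1 for i in range(len(flat))
                       for j in range(i + 1, len(flat))
                       if flat[i] > flat[j])
    blank_row_from_bottom = 4 - tiles.index(0) // 4
    return (inversions + blank_row_from_bottom) % 2 == 1

solved = list(range(1, 16)) + [0]
unsolvable = solved[:]                      # the classic "14-15" swap
unsolvable[13], unsolvable[14] = unsolvable[14], unsolvable[13]
print(solvable_15_puzzle(solved))       # True
print(solvable_15_puzzle(unsolvable))   # False
```

<p>Swapping only the 14 and 15 tiles, as in the famous unsolvable arrangement, flips the parity, which is exactly why it can never be solved by sliding alone.</p>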
<blockquote>
<p><img src="/assets/rubiks-axes.png" alt="Cube Axes" title="Cube Axes" />
Pieces on the Rubik’s cube move by rotating along three axes: x, y, and z. Like the tiles in a sliding puzzle, the pieces of a Rubik’s cube are not firmly attached to the axes. They lock into one another, allowing the pieces to move freely between layers. The structural integrity of the puzzle requires that all the pieces be hooked into one another; if one piece is pried out, the whole puzzle can collapse.</p>
</blockquote>
<p>The Rubik’s cube (or generically, the ‘magic cube’) is essentially the same puzzle in 3 dimensions. In the public imagination the cube is complicated and surrounded by mystery, but it is actually very simple when we consider the rules it follows. It has three axes (x, y, z) which the layers of the cube rotate around. The centre pieces of each color are bound to their axis, and never move in relation to one another; yellow is always opposite white, blue opposite green, and red opposite orange. If we place white on top and green in front, the left centre piece will always be orange, and the right centre piece will always be red. Consequently, there is only one position that each piece can occupy to be considered solved: the white-orange-green corner piece <em>has</em> to occupy the corner between the white, orange, and green centres, and the yellow-blue edge piece <em>has</em> to occupy the edge between the yellow and blue centres, and so on.</p>
<blockquote>
<p><img src="/assets/cubecore.png" alt="Cube Core" title="Cube Core" />
The core of a cube (I believe this is from a GAN cube). The centre pieces have been removed, but this clearly demonstrates that the centres cannot move relative to one another; they only rotate on their axes!</p>
</blockquote>
<p>Like the numbers on a sliding puzzle, the colors only help denote the position each piece is supposed to occupy. They’re a rainbow coordinate system. As such, it’s helpful to think of the puzzle in terms of solving pieces or layers of pieces, rather than solving faces. I’ve often seen someone ‘solve’ a single face only to find the pieces were arranged in the wrong order! Each piece gives us two or three bits of information about where it belongs, and we need to utilize all of this information to arrive at a solution. The rule here is: the base unit is the piece, not the coloured stickers on the pieces.</p>
<p>Like the sliding puzzle, it’s impossible to move pieces around the cube without affecting other pieces, and any scramble done from a solved state will be solvable. Also like the sliding puzzle, if the cube is scrambled by popping out pieces and reinserting them in a random order, or by rearranging the stickers, there’s a good chance it won’t be solvable and we’ll always end up with some pieces flipped the wrong way around (since every move affects other pieces, flipping a piece the right way will flip another piece the wrong way). On a 3x3x3 cube, only one in twelve scrambles done this way is solvable.</p>
<p>Since the cube is rule-bound and its characteristics can be understood, it’s possible to develop systematic methods for solving the cube. Indeed, advanced cube solving methods like <a href="https://www.speedsolving.com/wiki/index.php/CFOP_method">CFOP</a>, <a href="https://www.speedsolving.com/wiki/index.php/Roux_method">Roux</a>, and <a href="https://www.speedsolving.com/wiki/index.php/Petrus_Method">Petrus</a> break the solution down into phases, and rely on a combination of intuitive solving (figuring out the solution on your own) and algorithmic solving (memorizing a series of moves which efficiently solve recognizable patterns).</p>
<p>Understanding the cube also allows us to determine exactly how many unique scrambles are possible. In 1981, the original Rubik’s cube was marketed as having over three billion combinations. While this is technically true, the cube has many, many more combinations than three billion! In calculating this, we need to account for both the position and orientation of each piece.</p>
<p>Since there are eight corners, there are 8! (8-factorial) ways to arrange them, or 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1. Once we’ve placed our first corner, there are seven other positions remaining for the rest of the corners. Once we place the second corner, there are 6 positions remaining, and so on. All in all, there are 40,320 possible corner arrangements!</p>
<blockquote>
<p>Combinations = 8! …</p>
</blockquote>
<p>Each corner has three sides, and thus three ways it can be oriented. That white-orange-green corner piece could be white-up, orange-up, or green-up. Since there are 8 corners with 3 orientations each, this would be 3^8 possible orientation combinations. However, if we remember that pieces do not move independently but affect other pieces, the orientation of our final corner is going to be determined by the orientation of the other corners. Thus, we actually end up with 3^7 orientation combinations, or 2,187.</p>
<blockquote>
<p>Combinations = 8! * 3^7 …</p>
</blockquote>
<p>There are 12 edge pieces, so it is tempting to say there are 12! ways to arrange them. However, every face turn cycles four corners and four edges at once, which is an even permutation of the pieces. No sequence of moves can simply swap a single pair of edges: the parity of the edge arrangement is tied to the parity of the corner arrangement. This cuts the number of reachable edge arrangements in half, leaving 12! / 2, or 239,500,800 edge arrangements.</p>
<blockquote>
<p>Combinations = 8! * 3^7 * (12! / 2) …</p>
</blockquote>
<p>Each of the 12 edge pieces has two possible orientations, which would be 2^12 possible orientation combinations. However, like the corner orientations, the last one depends on the orientation of the others, so we really end up with 2^11, or 2,048 combinations.</p>
<blockquote>
<p>Combinations = 8! * 3^7 * (12! / 2) * 2^11</p>
</blockquote>
<p>Let’s resolve this:</p>
<blockquote>
<p>Combinations = 40320 * 2187 * 239500800 * 2048<br />
Combinations = 43,252,003,274,489,856,000</p>
</blockquote>
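<p>As a sanity check, the whole calculation fits in a few lines of Python (my own illustration):</p>

```python
from math import factorial

corner_positions = factorial(8)      # 8! ways to place the corners
corner_twists = 3 ** 7               # the last corner's twist is forced
edge_positions = factorial(12) // 2  # permutation parity halves 12!
edge_flips = 2 ** 11                 # the last edge's flip is forced

combinations = corner_positions * corner_twists * edge_positions * edge_flips
print(f"{combinations:,}")  # 43,252,003,274,489,856,000
```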
<p>That’s over 43 quintillion! Three billion fits into this number over 14 billion times! Talk about an underestimate!</p>You’ve probably played with a ‘15 puzzle’ or ‘8 puzzle’ sliding puzzle before, and likely even solved it. I’ve owned many cheap plastic 8 puzzles over the years, since they were a staple of birthday party goodie-bags. An 8-year-old can figure out the 8 puzzle within a couple of minutes. If you’re not familiar, sliding puzzles involve sliding numbered tiles around a board until the numbers are in order (top to bottom, left to right), with the blank space in the bottom-right corner.A Cartesian Celestial Coordinate Function in Python2019-06-26T20:27:00-07:002019-06-26T20:27:00-07:00/python/astronomy/2019/06/26/cartesian-coordinate-function<p>As an addendum to my recent post on <a href="/astronomy/2019/06/22/converting-equatorial-to-cartesian.html">converting equatorial celestial coordinates to Cartesian coordinates</a> (x, y, z), I wrote a small Python function to demonstrate. It takes right ascension in hours, minutes, and seconds; declination in degrees, hours, and seconds; and distance. It returns x y z coordinates as a tuple (but it would be easy to turn into another data type).</p>
<figure class="highlight">
<pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">math</span>
<span class="c1"># Convert sexigesimal degrees to decimal degrees
</span><span class="k">def</span> <span class="nf">sexigesimalToDecimalDegrees</span><span class="p">(</span><span class="n">degrees</span><span class="p">,</span> <span class="n">minutes</span><span class="p">,</span> <span class="n">seconds</span><span class="p">):</span>
<span class="n">decimalDegrees</span> <span class="o">=</span> <span class="n">degrees</span> <span class="o">+</span> <span class="p">(</span><span class="n">minutes</span> <span class="o">/</span> <span class="mi">60</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="n">seconds</span> <span class="o">/</span> <span class="mi">3600</span><span class="p">)</span>
<span class="k">return</span> <span class="n">decimalDegrees</span>
<span class="c1"># Converts right ascension, declination, and distance to x y z coordinates
</span><span class="k">def</span> <span class="nf">cartesianCoordinates</span><span class="p">(</span><span class="n">rhours</span><span class="p">,</span> <span class="n">rminutes</span><span class="p">,</span> <span class="n">rseconds</span><span class="p">,</span> <span class="n">ddegrees</span><span class="p">,</span> <span class="n">dminutes</span><span class="p">,</span> <span class="n">dseconds</span><span class="p">,</span> <span class="n">distance</span><span class="p">):</span>
<span class="n">alpha</span> <span class="o">=</span> <span class="n">math</span><span class="p">.</span><span class="n">radians</span><span class="p">(</span><span class="n">sexigesimalToDecimalDegrees</span><span class="p">((</span><span class="n">rhours</span> <span class="o">*</span> <span class="mi">15</span><span class="p">),</span> <span class="p">(</span><span class="n">rminutes</span> <span class="o">*</span> <span class="mi">15</span><span class="p">),</span> <span class="p">(</span><span class="n">rseconds</span> <span class="o">*</span> <span class="mi">15</span><span class="p">)))</span>
<span class="n">delta</span> <span class="o">=</span> <span class="n">math</span><span class="p">.</span><span class="n">radians</span><span class="p">(</span><span class="n">sexigesimalToDecimalDegrees</span><span class="p">(</span><span class="n">ddegrees</span><span class="p">,</span> <span class="n">dminutes</span><span class="p">,</span> <span class="n">dseconds</span><span class="p">))</span>
<span class="n">x</span> <span class="o">=</span> <span class="nb">round</span><span class="p">((</span><span class="n">distance</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="n">cos</span><span class="p">(</span><span class="n">delta</span><span class="p">)</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="n">cos</span><span class="p">(</span><span class="n">alpha</span><span class="p">)),</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="nb">round</span><span class="p">((</span><span class="n">distance</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="n">cos</span><span class="p">(</span><span class="n">delta</span><span class="p">)</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="n">sin</span><span class="p">(</span><span class="n">alpha</span><span class="p">)),</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="nb">round</span><span class="p">((</span><span class="n">distance</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="n">sin</span><span class="p">(</span><span class="n">delta</span><span class="p">)),</span> <span class="mi">3</span><span class="p">)</span>
<span class="k">return</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span></code></pre>
</figure>
<p>A quick test, again utilizing Aldebaran:</p>
<figure class="highlight">
<pre><code class="language-python" data-lang="python"><span class="c1"># Right ascension: 4h 35m 55.23907s
# Declination: 16° 30' 33.4885''
# Distance: 20.0 pc
</span><span class="n">aldebaran</span> <span class="o">=</span> <span class="n">cartesianCoordinates</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mf">55.23907</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mf">33.4885</span><span class="p">,</span> <span class="mf">20.0</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">aldebaran</span><span class="p">)</span></code></pre>
</figure>
<p>When we run the code, our Python shell spits out</p>
<blockquote>
<p>(6.878, 17.899, 5.683)</p>
</blockquote>
<p>Next we’ll need to write a program to utilize this function. I have two ideas in mind: a script to parse and reformat a CSV or Excel doc containing star data, and a webapp to allow anyone to convert the coordinates of a star they’re curious about for… fun. But I’ll leave that for another time.</p>As an addendum to my recent post on converting equatorial celestial coordinates to Cartesian coordinates (x, y, z), I wrote a small Python function to demonstrate. It takes right ascension in hours, minutes, and seconds; declination in degrees, minutes, and seconds; and distance. It returns (x, y, z) coordinates as a tuple (but it would be easy to turn into another data type).Converting Equatorial Celestial Coordinates to a Cartesian System2019-06-22T10:48:54-07:002019-06-22T10:48:54-07:00/astronomy/2019/06/22/converting-equatorial-to-cartesian<p>Celestial coordinates are usually given in an equatorial coordinate system, utilizing right ascension and declination, that is, their location relative to Earth. This is useful for finding stars in the night sky on Earth, but not from other locations. What if we were on a planet orbiting Aldebaran, and needed to find a particular star from this different vantage point? Or wanted to find the distance between two stars, rather than the distance of each star from Earth? Cartesian coordinates (x, y, z) are much more flexible for this purpose.</p>
<p>It’s pretty easy to convert in and out of Cartesian coordinates, meaning we can use this system as a common frame linking different coordinate systems. Another advantage is that we can choose a different origin point by subtracting the coordinates of the star we wish to use as the origin from every other star’s coordinates. If we want to model the night sky from a planet in a different star system, we’ll have an easier time with universal Cartesian coordinates than with angles from the perspective of Earth.</p>
<blockquote>
<p><img src="/assets/descartes.jpg" alt="Rene Descartes" title="Rene Descartes" />
Rene Descartes, who invented Cartesian coordinates in 1637.</p>
</blockquote>
<p>We can calculate Cartesian coordinates as long as we have right ascension (α), declination (δ), and distance. Since celestial objects are in motion, it’s also important to use a time reference (epoch). Presently, the equatorial coordinates of celestial objects are usually given according to the J2000.0 epoch, which is January 1st, 2000, 12:00 Terrestrial Time (about 11:59 UTC). Right ascension is usually given in hour angles, declination is usually given in sexagesimal degrees (base-60, using minutes and seconds), and distance is usually given in parsecs (pc) or light-years (ly). For instance, Aldebaran has a right ascension of 04h 35m 55.23907s, a declination of +16° 30’ 33.4885’’, and a distance of about 20.0 parsecs. We’ll need to know how to convert these measurements into a form we can use, but I’ll get into that later.</p>
<p>First, we’ll define our axes. As in an equatorial system, we set the origin point to the centre of the Earth, and our xy-plane will be the plane of the celestial equator (an imaginary plane extending outward from the Earth’s equator, used as the fundamental plane in the equatorial system).</p>
<ul>
<li>
<p>The X-axis points toward δ = 0 degrees, α = 0.0 hours. This is the vernal (March) equinox point. It points toward the sun at the time it appears to cross the celestial equator northward. The vernal point is also the origin of right ascension in the equatorial system.</p>
</li>
<li>
<p>The Y-axis points toward δ = 0 degrees, α = 6.0 hours. This is 90 degrees east of X (6 hours * 15 = 90 degrees).</p>
</li>
<li>
<p>The Z-axis points toward δ = +90 degrees, which is along the Earth’s north polar axis.</p>
</li>
</ul>
<p>Now, we need to change our measurements into a form we can use. Hour angles and sexagesimal degrees (in minutes and seconds) need to be converted to decimal degrees or radians. If you’re using Excel or Python to do your math, you’ll need radians. Converting right ascension in angular hours, minutes, and seconds into sexagesimal degrees, minutes, and seconds is relatively simple: we just multiply each number by 15.</p>
<blockquote>
<p><img src="/assets/aldebaran.jpg" alt="Aldebaran" />
Aldebaran, a red giant star relatively close to Earth. It is one of the brightest stars in the night sky, and there is evidence it hosts a planetary system of its own. I use it as an example because it’s the first star I thought of, thanks to Captain Picard’s experience with Aldebaran whisky.</p>
</blockquote>
<p>For example, Aldebaran’s right ascension of 04h 35m 55.23907s can be expressed in sexagesimal degrees as follows:</p>
<blockquote>
<p>4 * 15 = 60<br />
35 * 15 = 525<br />
55.23907 * 15 = 828.58605</p>
</blockquote>
<blockquote>
<p>60° 525’ 828.58605’’</p>
</blockquote>
<p>In the sexagesimal system, 1° = 60’ = 3600’’, or 1’’ = (1/3600)°. If we want, we can make things a little more readable.</p>
<blockquote>
<p>60° 525’ 828.58605’’ becomes 68° 58’ 48.58605’’</p>
</blockquote>
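<p>For the curious, that carrying step can be sketched as a small Python helper (the function name is my own; this isn’t required for the conversion):</p>

```python
def normalize_dms(d, m, s):
    # Carry whole minutes out of the seconds, then whole degrees
    # out of the minutes (60'' = 1', 60' = 1 degree).
    m += int(s // 60)
    s = s % 60
    d += int(m // 60)
    m = m % 60
    return (d, m, s)

# normalize_dms(60, 525, 828.58605) gives (68, 58, 48.58605...)
```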
<p>But this won’t matter once we convert further, since we’ll be summing the components anyway. Our formula to convert sexagesimal degrees to decimal degrees is:</p>
<blockquote>
<p>D = d + (m / 60) + (s / 3600)</p>
</blockquote>
<p>Where D is decimal degrees, d is sexagesimal degrees (e.g. 60°), m is minutes (e.g. 525’), and s is seconds (e.g. 828.58605’’). If we plug in our values for Aldebaran, the equation looks like</p>
<blockquote>
<p>D = 60 + (525 / 60) + (828.58605 / 3600)</p>
</blockquote>
<p>and resolves to 68.9801627917 degrees. We need to do the same conversion with declination (Aldebaran’s declination is +16° 30’ 33.4885’’ which converts to 16.5093023611 degrees). We can convert decimal degrees to radians with this formula:</p>
<blockquote>
<p>Radians = degrees * (pi / 180)</p>
</blockquote>
<p>For Aldebaran, right ascension converts to 1.2039309593 radians, and declination to 0.2881416834 radians. As alluded to earlier, Excel’s sine and cosine functions expect angles in radians (you can convert degrees to radians with the RADIANS() function). This is also the case in many programming languages. Python’s math.cos() and math.sin() functions expect radians (and Python will convert degrees to radians for us with the math.radians() function!). If we want to automate our Equatorial-to-Cartesian conversion, we can’t leave out this step.</p>
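<p>Putting the angle conversions together, here’s a minimal Python sketch (the function names are my own):</p>

```python
import math

def hms_to_degrees(h, m, s):
    # Right ascension: multiply each component by 15,
    # then sum as D = d + (m / 60) + (s / 3600)
    return (h * 15) + (m * 15) / 60 + (s * 15) / 3600

def dms_to_degrees(d, m, s):
    # Declination: sum the sexagesimal components directly
    return d + (m / 60) + (s / 3600)

# Aldebaran: RA 4h 35m 55.23907s, dec +16deg 30' 33.4885''
ra = math.radians(hms_to_degrees(4, 35, 55.23907))   # ~1.2039309593
dec = math.radians(dms_to_degrees(16, 30, 33.4885))  # ~0.2881416834
```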
<p>The last element we need is distance. Thankfully, celestial distance is almost always expressed in parsecs (pc) or light-years (ly). We can use either of these units. If you use light-years, the unit of your Cartesian coordinates will be light-years; if you use parsecs, the unit of your coordinates will be parsecs. If you want to chart more than one celestial object, make sure you’re using one unit or the other. It’s easy to convert between the two.</p>
<blockquote>
<p>1 pc = 3.261563777 ly</p>
</blockquote>
<p>So to turn parsecs into light-years we multiply our parsecs by 3.261563777, and to turn light-years into parsecs we divide light-years by the same number. Parsec is usually the preferred distance unit among astronomers. If by chance you have distance in another unit:</p>
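<p>As a quick sketch of the unit conversion in Python (the constant and function names are my own):</p>

```python
PC_TO_LY = 3.261563777  # light-years per parsec

def pc_to_ly(pc):
    return pc * PC_TO_LY

def ly_to_pc(ly):
    return ly / PC_TO_LY

# Aldebaran at 20.0 pc is roughly 65.23 light-years away
```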
<blockquote>
<p>1 pc = 206264.8 AU = 30,856,775,814,913,673 metres</p>
</blockquote>
<p>With our right ascension (α) and declination (δ) in radians, and our distance (d) in parsecs (or light-years), we can find our coordinates in x, y, z.</p>
<blockquote>
<p>x = d * cos(δ) * cos(α)</p>
</blockquote>
<blockquote>
<p>y = d * cos(δ) * sin(α)</p>
</blockquote>
<blockquote>
<p>z = d * sin(δ)</p>
</blockquote>
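<p>These three formulas translate directly into Python. A minimal sketch (the function name is my own):</p>

```python
import math

def equatorial_to_cartesian(ra_deg, dec_deg, distance):
    # ra_deg, dec_deg: right ascension and declination in decimal degrees
    # distance: in parsecs or light-years; the output shares the same unit
    a = math.radians(ra_deg)
    d = math.radians(dec_deg)
    x = distance * math.cos(d) * math.cos(a)
    y = distance * math.cos(d) * math.sin(a)
    z = distance * math.sin(d)
    return (x, y, z)

# Aldebaran: RA 68.9801627917 deg, dec 16.5093023611 deg, 20.0 pc
x, y, z = equatorial_to_cartesian(68.9801627917, 16.5093023611, 20.0)
# (x, y, z) is approximately (6.878, 17.899, 5.683)
```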
<p>To demonstrate, let’s return to our good friend Aldebaran! As we found earlier, Aldebaran has a right ascension of 1.2039309593 radians and a declination of 0.2881416834 radians. Its distance from Earth is about 20.0 pc. Let’s plug in these values.</p>
<p>For x:</p>
<blockquote>
<p>x = 20.0 * cos(0.2881416834) * cos(1.2039309593)<br />
x = 20.0 * 0.9587736104 * 0.3586911565<br />
x = 6.878 pc</p>
</blockquote>
<p>For y:</p>
<blockquote>
<p>y = 20.0 * cos(0.2881416834) * sin(1.2039309593)<br />
y = 20.0 * 0.9587736104 * 0.9334562948<br />
y = 17.899 pc</p>
</blockquote>
<p>For z:</p>
<blockquote>
<p>z = 20.0 * sin(0.2881416834)<br />
z = 20.0 * 0.2841710119<br />
z = 5.683 pc</p>
</blockquote>
<p>And there we have it! In sum, our new coordinate is (6.878, 17.899, 5.683), measured in parsecs. Of course, we can find the coordinates in light-years by multiplying each value by 3.261563777 (and probably rounding to a few decimal places). It would not be terribly difficult to iterate over a bunch of star systems and plot the coordinates into a three-dimensional star map.</p>
<p>A few more things worth mentioning:</p>
<p>On this scale, placing our origin point at the centre of the Earth (versus the Sun) doesn’t really matter, since distances inside our solar system are so small in comparison to the distances between stars. Conventionally, the centre of the Earth is used, since Earth is our usual point of reference. We could use the centre of the Sun as our origin instead, and define -X as pointing to Earth, and the coordinates we produce would be virtually identical (if we simply reused Earth-based equatorial coordinates, every point would be offset from the new origin by the tiny Earth-Sun distance). The point is, we can say the Sun is our origin and it wouldn’t make a difference, so feel free to mark your origin as the Sun, Sol, or whathaveyou on any star map you create.</p>
<p>To put this in perspective, the distance between the Earth and the Sun is about 149,597,870.7 kilometres, or 1 AU. The distance between Earth and the nearest other star (Proxima Centauri) is 1.3012 pc, or about 268,392 AU, or 4.015*10^13 km. That’s a proportion of about 0.00000373! A specific origin only matters if we need a very high degree of precision!</p>
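<p>The arithmetic behind that comparison, as a quick Python check (variable names are my own):</p>

```python
au_per_pc = 206264.8   # astronomical units per parsec
proxima_pc = 1.3012    # distance to Proxima Centauri in parsecs

proxima_au = proxima_pc * au_per_pc  # ~268,392 AU
ratio = 1 / proxima_au               # Earth-Sun distance vs. distance to Proxima
# ratio is ~0.00000373
```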
<p>That said, celestial bodies are constantly drifting (including Earth), and we’re usually looking at where an object was rather than where it currently is, because the light takes years to reach us. The reason we use a standard reference time (J2000.0) is that we’re plotting moving bodies in reference to a system (Sun-Earth) that is itself in motion, with fluctuations in that motion. To really see the sky from another star system, we’d need to figure out the actual location of our objects using their velocity, accounting for the time it takes the light to reach us, and the time it takes the light to reach our new origin point. These Cartesian coordinates serve only as a reasonable approximation of location. Don’t use these coordinates to plan a trip to a nearby star system or you probably won’t get where you meant to go. These coordinates ultimately tell you where stars appeared to be, relative to Earth, at a specific date and time, but with some more information we might be able to make them more accurate to the ‘true’ locations.</p>Celestial coordinates are usually given in an equatorial coordinate system, utilizing right ascension and declination, that is, their location relative to Earth. This is useful for finding stars in the night sky on Earth, but not from other locations. What if we were on a planet orbiting Aldebaran, and needed to find a particular star from this different vantage point? Or wanted to find the distance between two stars, rather than the distance of each star from Earth? Cartesian coordinates (x, y, z) are much more flexible for this purpose.Hello World2019-06-19T15:48:54-07:002019-06-19T15:48:54-07:00/python/2019/06/19/hello-world<p>This is the first post. I needed something to fill this space.</p>
<p>Here I am testing the code snippet functionality with a very simple Python ‘Hello World’ program:</p>
<figure class="highlight">
<pre><code class="language-python" data-lang="python"><span class="k">print</span><span class="p">(</span><span class="s">'Hello world!'</span><span class="p">)</span>
<span class="c1">#=> prints 'Hello world!'.</span></code></pre>
</figure>
<p>Or a little more complicated:</p>
<figure class="highlight">
<pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">helloWorld</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'Hello world!'</span><span class="p">)</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">helloWorld</span><span class="p">()</span>
<span class="c1">#=> prints 'Hello world!'.</span></code></pre>
</figure>
<p>Nice.</p>This is the first post. I needed something to fill this space.