The Problem

This week’s Classic asks:

“Graydon is about to depart on a boating expedition that seeks to catch footage of the rare aquatic creature, F. Riddlerius. Every day he is away, he will send a hand-written letter to his new best friend, David Hacker. But if Graydon still has not spotted the creature after N days (where N is some very, very large number), he will return home.

Knowing the value of N, Graydon confides to David there is only a 50 percent chance of the expedition ending in success before the N days have passed. But as soon as any footage is collected, he will immediately return home (after sending a letter that day, of course).

On average, for what fraction of the N days should David expect to receive a letter?”

The Approach

We will solve the problem in a more general form by setting the 50% value as a parameter:

P_s = probability\ of\ success


Note: Corrected after seeing from other solutions that part of the probability was left out!

First we observe that, given a uniform daily probability p of observing the creature, the probability of failure of the expedition after N days is:

P_f = 1 - P_s = (1-p)^N

The expected fraction of days a letter is sent will be the expected value of the number of days letters are sent divided by N. The probability of i < N letters being sent is the probability of i-1 days of non-discovery times the probability of discovery on the ith day. If the creature is never discovered, letters are sent on all N days (a fraction of 1), so we add the probability of that outcome:

E = \dfrac{1}{N} \sum_{i=1}^{N} i \cdot p \cdot (1-p)^{i-1} + (1-p)^N

With some work we can evaluate this sum. One trick to notice is that a sum like this is essentially the derivative of a standard geometric series with an extra term. With a fair bit of algebra and re-arrangements we find:

E = \dfrac {1- (1-p)^N}{N \cdot p}

Substituting our value for 1-P_s above and simplifying:

E =  \dfrac{P_s}{N \cdot p}

For large N and small p we can use the following approximation. Taking logarithms of

(1-p)^N = 1 - P_s

gives N\cdot \ln(1-p) = \ln(1-P_s), and since \ln(1-p) \approx -p for small p:

N\cdot p \approx -\ln(1-P_s)

And our final expression for the expected fraction of N days that a letter is received is:

E \approx  \dfrac{P_s}{\ln(1/(1-P_s))}

For our specific case of P_s = 50% the expectation is:

E \approx \dfrac{1}{2 \cdot \ln(2)}  \approx 0.7213
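As a numerical sanity check (a sketch; the choice N = 2000 is arbitrary, and the daily probability p is derived from P_s), we can compare the direct sum, the closed form, and the large-N approximation:

```python
import math

def expected_fraction(N, P_s):
    """Expected fraction of N days with a letter, by direct summation."""
    p = 1 - (1 - P_s) ** (1 / N)          # daily probability consistent with P_s
    q = 1 - p
    # sum over the day of discovery, plus the all-N-days failure outcome
    return sum(i * p * q ** (i - 1) for i in range(1, N + 1)) / N + q ** N

N, P_s = 2000, 0.5
exact = expected_fraction(N, P_s)
closed_form = P_s / (N * (1 - (1 - P_s) ** (1 / N)))
approx = P_s / math.log(1 / (1 - P_s))    # the large-N approximation
print(exact, closed_form, approx)
```

All three values agree to within the expected O(1/N) error of the approximation.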

We graph the expected percentage of days with letters versus probability of expedition success:

We can also look at the case where P_s approaches one, and we can see that E will slowly approach zero. In this case we know the unconstrained expected number of letters will be 1/p, so the expected fraction will be 1/(N\cdot p), which goes as -1/\ln(1-P_s) and hence to zero as P_s goes to 1. However, it goes to zero very slowly:


Stay in Your Lane

This week’s Classic asks us to consider an extension to a traffic Riddler problem from the way-back times. The original solution found that the expected number of groups that would form with N cars would be Eg(N) = H(N), where H(N) = \sum_{i=1}^{N} 1/i is the Nth harmonic number.

Here we add an exit to another lane with a one-time only access option, and the problem asks for the expected number of traffic groups in equilibrium in both lanes after the drivers decide to switch or not.

We make the following claims:

  • If there are g groups in the first lane, then after the switching there will still be g groups since there is no reason for the lead drivers in the original groups to change anything. They are driving as fast as they can. Therefore the expected number of groups in the first lane is Eg1(N)= H(N) both before and after switching.
  • All other drivers except for the lead drivers will switch lanes. They are all speed constrained and will do better by switching, with the worst case being they will end up being stuck behind a faster driver.
  • If there were N cars and g groups in lane 1 before switching, then after switching there will be g cars and g groups in the first lane and N-g cars in the second lane.

For interest (not necessary to solve the problem) we will calculate a pdf for the number of groups in the first lane. This can be tricky, but we notice that for a particular ordering of the speeds of the cars, like [1,3,4,2,5,6] (where 1 is fastest), the number of groups is equal to the number of dips to a lower speed in the sequence, plus one. The groups formed there are { [1], [3], [4,2], [5], [6] }, where each time the velocity dips a new group is formed. If we define C(N,k) as the number of ways to form k groups delimited by dips out of the N! permutations of N ordered numbers, there is a well-known recursive result (these are the unsigned Stirling numbers of the first kind) that is very helpful:

C(N,k) = C(N-1,k-1) + (N-1)\cdot C(N-1,k)

[You can convince yourself of the above result by writing out the possibilities for N=2,3,4 by hand and seeing how it works! For an explanation involving the original Riddler problem see here.]

Bootstrapping up from C(2,1) = 1 and C(2,2) = 1 we can rapidly calculate the possible ways to make groups of a specific size. Here is a view of C(N,k) for the first few rows of N, and columns of k:

Note that the total number of ways to form groups in each row is N! and so we calculate the function pdf(N,k) by dividing each row by N!:
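The bootstrapping takes only a few lines (a sketch; function names are mine). As checks, each row of C(N,k) sums to N!, and the expected number of groups recovers H(N) exactly:

```python
from fractions import Fraction
from math import factorial

def group_counts(N_max):
    """C[N][k] = number of the N! speed orderings that form k groups,
    built from C(N,k) = C(N-1,k-1) + (N-1)*C(N-1,k)."""
    C = {2: {1: 1, 2: 1}}
    for N in range(3, N_max + 1):
        C[N] = {k: C[N - 1].get(k - 1, 0) + (N - 1) * C[N - 1].get(k, 0)
                for k in range(1, N + 1)}
    return C

N = 10
C = group_counts(N)
pdf = {k: c / factorial(N) for k, c in C[N].items()}   # normalize each row by N!
expected = sum(Fraction(k * c, factorial(N)) for k, c in C[N].items())
harmonic = sum(Fraction(1, i) for i in range(1, N + 1))
print(float(expected), float(harmonic))
```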

Here are pdfs of expected number of groups in lane 1 for n=10, 50, 100:

The expected number of groups in both the first and second lanes is (using an elegant result from Josh Silverman here):

Eg1(N) + Eg2(N) = \sum_{i=1}^{N} \frac{1}{i} \sum_{j=0}^{i-1} \frac{1}{j!}

And as mentioned above Eg1(N) = H(N), as all the initial lead drivers stay in lane 1, so:

Eg2(N) = \sum_{i=1}^{N} \frac{1}{i} \sum_{j=0}^{i-1} \frac{1}{j!} - H(N)
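Evaluating these sums directly is straightforward (a sketch; function names are mine):

```python
from math import factorial

def harmonic(N):
    return sum(1 / i for i in range(1, N + 1))

def total_groups(N):
    """Eg1(N) + Eg2(N) = sum_{i=1..N} (1/i) * sum_{j=0..i-1} 1/j!"""
    return sum(sum(1 / factorial(j) for j in range(i)) / i
               for i in range(1, N + 1))

def lane2_groups(N):
    return total_groups(N) - harmonic(N)   # Eg2 = total - Eg1 = total - H(N)

for N in (2, 10, 100):
    print(N, harmonic(N), lane2_groups(N))
```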

Here is a graph of Eg1(N) and Eg2(N) for N = 2 to 100. As Josh shows in his post, eventually the total number of groups in lanes 1 and 2 will approach e*H(N) for large N, with a limiting expectation of H(N) for lane 1 and (e-1)*H(N) for lane 2.

As mentioned above, all the initial group leaders stay in lane 1 (according to the rules of the problem they only switch lanes if they can increase their speed) and all the rest of the cars move to lane 2:


Bowler Problems

This week’s problem of calculating the probability of the last pin being knocked down in a rhombus of bowling pins is difficult, if not impossible, to calculate exactly for large N without simulation. I started with a simplified probability model which turns out to have a lot of the same features as the correct simulated calculation, and which appears to act as an upper bound.

We start by calculating a 2D grid with valid entries where the N^2 pins are located, each entry representing the probability of that pin falling down. Treating its one or two predecessors as independent, the combined probability P of a pin being knocked down by either of them is:

P(pin falling) = 1 – [ (1 – p*P(pin 1 above falling)) * (1 – p*P(pin 2 above falling)) ]

If pin 2 does not exist, as in the edge cases, P(pin 2 above falling) = 0 and the calculation can be identical for all the valid pin entries in a square grid, with entries without pins set to 0. This calculation is not exactly correct as the probabilities of the nodes in the 2D multiply connected grid are not really independent.

Propagating these probabilities for n=5, with the first pin going down with certainty, we see some interesting behavior already:

There are a few interesting things about the above conditional probability calculation:

  • At probabilities around .5 and below it appears that the last pin will show a rapidly decreasing probability of being knocked down.
  • It is possible for a pin to have a greater probability of being knocked down than its immediate two predecessors.
  • There is a condition where the probability of a pin falling equals the two pin probabilities above it and the grid will tend to “lock-up” at a constant probability in the middle. This condition can be calculated by setting all P’s in the first equation above equal, and is:

P = (2p-1)/ p^2

By calculating much larger grids and observing the results it appears that for large N and p > .5 this happens to all grids in the center region as the edge probabilities become irrelevant, and this probability ends up being the probability of the last pin falling down. For p <=.5 the probability of the last pin falling down appears to go to zero for large N.

Below are the last pin falling probabilities estimated with the stability condition and a grid calculation at N= 500. Again, remember these are not exact results but probably more like upper bounds:

[Table: p (pin falls) | stability estimate (2p-1)/p^2 | P(last pin falls), grid N = 500]

Focusing in on the critical p=.50 at larger N:

[Table: p (pin falls) | stability estimate (2p-1)/p^2 | P(last pin falls), grid N = 5000 | grid N = 10000]

And we note that crucially the p = .500 values are decreasing with increasing N, while values above p = .500 are still stabilizing and actually increasing at this stage. So this imperfect upper-bound calculation shows a sharp transition point at p = .500.

To get exact values we will need to run a simulation. Note we will need 2 random numbers per pin to account for the separate probabilities of striking the two pins underneath.

The simulation behaves a bit differently than our upper-bound calculation: it shows a transition point at around .600 or so (this sim is N=50 at 1000 reps):

Zooming in:

And here are results with N=100 and 5,000 reps:

[Table: p (pin falls) | P(last pin falls), simulation grid, N = 100, 5,000 reps]

It is difficult to know how stable these results will be for large N, e.g. how sharp the stability cut-off is. Judging by our simplified probability model results, the assumption is that the cut-off gets sharper as N gets larger.

Finally, I ran one simulation for p = .700 at N=200 and 10,000 reps and got a result of P= .579 , so I guess that is the end of it for now and we decide that Fosse will win this Battle of the Bowlers 57.9% of the time with the remaining 42.1% being draws. Probably.

Code for the probability calculation:
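Here is a minimal sketch of one way to implement it (the indexing is my own choice: the rhombus is stored as an N×N square with pin (0,0) first, and the predecessors of pin (i,j) are (i-1,j) and (i,j-1)):

```python
def last_pin_probability(N, p):
    """Propagate fall probabilities through an N x N rhombus of pins using
    P[i][j] = 1 - (1 - p*P[i-1][j]) * (1 - p*P[i][j-1]),
    with missing predecessors treated as probability 0."""
    P = [[0.0] * N for _ in range(N)]
    P[0][0] = 1.0                      # the first pin goes down with certainty
    for i in range(N):
        for j in range(N):
            if i == 0 and j == 0:
                continue
            a = P[i - 1][j] if i > 0 else 0.0
            b = P[i][j - 1] if j > 0 else 0.0
            P[i][j] = 1 - (1 - p * a) * (1 - p * b)
    return P[N - 1][N - 1]

for p in (0.3, 0.7):
    print(p, last_pin_probability(300, p), (2 * p - 1) / p ** 2)
```

For p above .5 the result tracks the lock-up value (2p-1)/p^2; below .5 it collapses toward zero, as described above.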

Somewhat inefficient code for the simulation:
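A sketch of the simulation (function name and indexing are mine; each falling pin gets two random draws, one for each pin below it):

```python
import random

def simulate_last_pin(N, p, reps, seed=0):
    """Monte Carlo estimate of P(last pin falls) in an N x N rhombus."""
    rng = random.Random(seed)
    last_falls = 0
    for _ in range(reps):
        struck = [[False] * N for _ in range(N)]
        struck[0][0] = True                    # first pin always falls
        for i in range(N):
            for j in range(N):
                if not struck[i][j]:
                    continue
                # a falling pin strikes each successor independently with prob p
                if i + 1 < N and rng.random() < p:
                    struck[i + 1][j] = True
                if j + 1 < N and rng.random() < p:
                    struck[i][j + 1] = True
        if struck[N - 1][N - 1]:
            last_falls += 1
    return last_falls / reps

print(simulate_last_pin(50, 0.7, 1000))
```

Row-major order works because each pin's predecessors are visited before it.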


Cherchez Le Électron

The following results were obtained by manually seeding starting positions into an optimizer, then manually fine tuning the optimizer results to account for obvious symmetries and to reduce optimizer noise. A useful strategy for the asymmetric cases in this problem is to start with a plausible solution centered on the origin and then move the entire group of cups around to maximize the total probability covered.

Note that n=5 and n=8 are offset from center, moving the asymmetrical perimeter cups towards the peak of the probability distribution to maximize total probability. With n >7 it gets harder to increase probability significantly and it seems likely that the optimal cup locations will tile the plane in center-hexagonal formation with increasingly small offsets from center, with offsets approaching zero as n becomes large.

Note: I have updated my answer for n=6 after seeing better answers from Tom Keith and Jenny Mitchell – thanks!


Socially Distant Goats

Polite, yet demanding, our goats will wait their turn.


This week’s Classic from Quoc Tran asks:

“A goat tower has 10 floors, each of which can accommodate a single goat. Ten goats approach the tower, and each goat has its own (random) preference of floor. Multiple goats can prefer the same floor.

One by one, each goat walks up the tower to its preferred room. If the floor is empty, the goat will make itself at home. But if the floor is already occupied by another goat, then it will keep going up until it finds the next empty floor, which it will occupy. But if it does not find any empty floors, the goat will be stuck on the roof of the tower.

What is the probability that all 10 goats will have their own floor, meaning no goat is left stranded on the roof of the tower?”

After considering the problem there are a few observations:

  • The order in which goats choose floors will not change the overall outcome of the result (whether all the goats do or do not get a floor.)
  • The condition that all goats find a floor is equivalent to the condition that in a sorted list of goat floor choices, each entry in the list is <= its position in the list. For example: [0,1,2,2,3,5,6,7,8,9] is a valid sorted selection of random goat floor choices, with positions labeled 0-9, and [1,2,2,3,4,5,6,7,8,9] is not (its first entry exceeds position 0). We will use this as a fast way to filter out possible sorted solutions.

We will proceed this way and compute the number of valid possible random goat floor choices. One issue is that the entire space of choices is very large: 10^10. We will reduce this by using our sorting observation: calculate all the sorted combinations with repetitions (multiple goats can choose the same floor), look for valid combinations, then calculate the number of permutations (not the permutations themselves) for each valid combination to arrive at the correct total probability. The number of unique sorted combinations with repetitions for n=10 is \binom {19}{10}, or only 92,378. This calculation only takes a few seconds, and we calculate some of the program variables for n=2-12.

Note that the percentage of valid permutations of sorted combinations is much higher than the percentage of valid sorted combinations alone. I was unable to get the analytical form for the solution, but was able to visually pick out some functional forms in the table, which upon doing some (OEIS!) research we see are well known in the combinatoric literature of “parking functions.”
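A sketch of the computation (function name is mine; the parking-function count (n+1)^{n-1} from the combinatorics literature serves as an independent check):

```python
from itertools import combinations_with_replacement
from math import factorial
from collections import Counter

def goat_probability(n):
    """Count valid preference sequences: enumerate sorted combinations
    (with repetition), filter with entry <= position, then add the number
    of permutations of each valid combination."""
    valid = 0
    for combo in combinations_with_replacement(range(n), n):
        if all(a <= i for i, a in enumerate(combo)):
            perms = factorial(n)
            for c in Counter(combo).values():
                perms //= factorial(c)
            valid += perms
    return valid, valid / n ** n

count, prob = goat_probability(6)
print(count, prob)
```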



Combinatoric Derivatives

This week’s Riddler Classic asks:

“You have an urn with an equal number of red balls and white balls, but you have no information about what that number might be. You draw 19 balls at random, without replacement, and you get eight red balls and 11 white balls. What is your best guess for the original number of balls (red and white) in the urn?”

This is a standard combinatorics problem, with the probability of the draw conditional on n total balls, and r red and w white, being:

p(n,r,w) = \dfrac {\binom{n/2}{r}\cdot\binom{n/2}{w}}{\binom{n}{r+w}}

So here the maximum likelihood is the value of n that maximizes this probability given r=8 and w = 11, where n has to be an even integer, with a minimum possible value of 22.

So we can just successively calculate the probabilities starting at n = 22 and we find a maximum at n= 34, with p = 16.21%.
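A sketch of the successive calculation (the search cap of n = 200 is an arbitrary choice of mine):

```python
from math import comb

def draw_probability(n, r, w):
    """P(draw r red and w white) from an urn with n/2 red and n/2 white."""
    return comb(n // 2, r) * comb(n // 2, w) / comb(n, r + w)

r, w = 8, 11
# even n only, starting from the minimum possible urn size
best_n = max(range(2 * max(r, w), 200, 2), key=lambda n: draw_probability(n, r, w))
print(best_n, draw_probability(best_n, r, w))
```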

But is there a way we can try to maximize this expression without a trial and error approach? This might be useful for very large n for example. There is at least an approximate approach.

First we will rewrite this expression as:

p = \dfrac {(r+w)!}{r!\cdot w!} \cdot  \frac {\displaystyle \prod_{i=n/2-r+1}^{n/2} i \cdot \prod_{i=n/2-w+1}^{n/2} i} {\displaystyle \prod_{i=n-r-w+1}^{n} i}

Even though n, r and w are integers, this function is pretty smooth. So we will take the approach of looking for a value of n that makes p stationary under small changes of n. Here a small change will be a change in n of −2, and a change in n/2 of −1. Writing p' for p evaluated at n−2, we will then look for values of n such that p' − p = 0.

p'-p =  \frac {\displaystyle \prod_{i=n/2-r}^{n/2-1} i \cdot \prod_{i=n/2-w}^{n/2-1} i} {\displaystyle \prod_{i=n-r-w-1}^{n-2} i} - \frac {\displaystyle \prod_{i=n/2-r+1}^{n/2} i \cdot \prod_{i=n/2-w+1}^{n/2} i} {\displaystyle \prod_{i=n-r-w+1}^{n} i}  = 0

Re-arranging terms and cross-multiplying, the stationarity condition p' = p becomes:

\dfrac {(n/2) \cdot (n/2)} {(n/2-r) \cdot (n/2-w)} = \dfrac {n\cdot(n-1)}{(n-r-w-1)\cdot(n-r-w)}

Cross multiplying again, collecting terms and simplifying we obtain the following expression for n:

n = \dfrac {4r \cdot w}{r + w - (r-w)^2}

Plugging in r = 8 and w = 11 we get n = 35.2, which is pretty close to the correct answer, 34. The way we have set up the p(n−2) − p(n) = 0 condition, the calculated values of p(n−2) and p(n) will likely straddle the correct maximum. Therefore the exact answer for n appears to be the greatest even integer <= the calculated value of n.
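We can check this recipe numerically (a sketch; the brute-force search cap is arbitrary):

```python
from math import comb, floor

def draw_probability(n, r, w):
    return comb(n // 2, r) * comb(n // 2, w) / comb(n, r + w)

def n_estimate(r, w):
    """Approximate maximizer from the stationarity condition."""
    return 4 * r * w / (r + w - (r - w) ** 2)

r, w = 8, 11
est = n_estimate(r, w)
n_exact = max(range(2 * max(r, w), 400, 2),
              key=lambda n: draw_probability(n, r, w))
n_rounded = 2 * floor(est / 2)               # greatest even integer <= estimate
print(est, n_rounded, n_exact)
```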

Note: it is not clear that there are maximum values of p for all combinations of r and w. If we look at the denominator of our expression it appears that for values that result in a denominator value <= 0 the problem may not have a finite maximum likelihood value of n.

Here are some plots of red/white combinations that meet the above denominator condition for maximum likelihood values corresponding to a finite n:


Ergodic Grasshopper II

Here is a proof that the probability density in the middle of the beam must be twice the probability density at the end of the beam, assuming a trapezoidal form of the pdf and the particular “soft reflection” jump pdf near the ends of the beam.

We first note that the steady state condition implies zero net probability flux into and out of any region on the beam. We examine the flux at an infinitesimal region dx’ from the left end of the beam. The jump pdf half width is w. The grasshopper’s steady state probability density at the end of the beam is p(e) and in the middle region it is p(m).

\text{Probability flux out} = p(e) * dx'

\text{Probability flux in} = \int_{0}^{w} \left[ p(e) + (p(m)-p(e))\cdot \dfrac{x}{w} \right] \cdot \dfrac{dx'}{w+x} \,dx

Evaluating the second integral and setting in and out flux equal:

\left[ (2\cdot p(e)-p(m)) \cdot \ln(x+w) + (p(m)-p(e))\cdot \dfrac{x}{w} \right] \Big|_{0}^{w} \cdot dx' = p(e) \cdot dx'

Evaluating at the limits and simplifying:

[(2\cdot p(e)-p(m)) \cdot log(2) + p(m)-p(e)] \cdot dx' = p(e) \cdot dx'

And we see that the probability flux conservation condition is met when:

p(m) = 2 \cdot p(e)
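A quick numerical check of the flux balance (a sketch; the values w = 0.2 and p(e) = 1 are arbitrary). The incoming flux integral equals the outgoing flux p(e) exactly when p(m) = 2·p(e):

```python
def flux_in(pe, pm, w, steps=100_000):
    """Midpoint-rule evaluation of the incoming-flux integral
    int_0^w [pe + (pm - pe)*x/w] / (w + x) dx  (per unit dx')."""
    h = w / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += (pe + (pm - pe) * x / w) / (w + x)
    return total * h

pe, w = 1.0, 0.2
print(flux_in(pe, 2.0 * pe, w))   # balances the outgoing flux pe
print(flux_in(pe, 1.5 * pe, w))   # does not balance
```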

Ergodic Grasshopper

This week’s Riddler poses a question about a grasshopper on a balance beam:

You are trying to catch a grasshopper on a balance beam that is 1 meter long. Every time you try to catch it, it jumps to a random point along the interval between 20 centimeters left of its current position and 20 centimeters right of its current position.

If the grasshopper is within 20 centimeters of one of the edges, it will not jump off the edge. For example, if it is 10 centimeters from the left edge of the beam, then it will randomly jump to anywhere within 30 centimeters of that edge with equal probability (meaning it will be twice as likely to jump right as it is to jump left).

After many, many failed attempts to catch the grasshopper, where is it most likely to be on the beam? Where is it least likely? And what is the ratio between these respective probabilities?

We observe that the probability of the grasshopper being at any specific position on the beam like the exact middle or the exact end is zero with a continuous distribution. So, we proceed by coarse graining the problem a bit and limit the valid locations on the balance beam to n points, calculate transition probabilities from point to point, raise the transition matrix to a suitably high power, look at the results, and then let the distance between points go to zero. We will see that the distance between points actually doesn’t really matter within reason in this problem!

We will start by locating 21 points on the beam and calculate a transition probability matrix T (see Python code below):
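A sketch of one way to build T (an assumption of mine: each jump lands on a uniformly chosen different grid point within 20 cm, clipped at the beam ends):

```python
import numpy as np

L, hop, n = 100, 20, 21                  # beam length (cm), jump half-width, points
xs = np.linspace(0, L, n)

T = np.zeros((n, n))
for i, x in enumerate(xs):
    # reachable points: within 20 cm, on the beam, excluding the current point
    reachable = [j for j, y in enumerate(xs) if j != i and abs(y - x) <= hop]
    for j in reachable:
        T[i, j] = 1 / len(reachable)

steady = np.linalg.matrix_power(T, 1000)[0]   # any row after many steps
ratio = steady[n // 2] / steady[0]
print(ratio)
```

With this discretization the stationary probability of each point is proportional to its number of reachable neighbors, so the middle-to-end ratio is exactly (2·hop)/(hop) = 2 at any grid spacing.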

The system essentially has a “soft reflection” when the boundaries of the probability interval move past the end of the beam, which is kind of interesting.

To find the longer term behavior of this system, we follow Markov chain theory and raise the T matrix to the arbitrary power of 100, at which point it should settle to a steady state if there is one:

We see a good result – all the rows are identical. This means that the initial location of the grasshopper, or probability distribution of the initial location, does not matter at all, and that each row describes the long term probability of the grasshopper’s location. Note: these row values can also be found by solving the linear equation x*T = x with constraint that sum(x) = 1, which is the equivalent of calculating the left eigenvector of the transition matrix with eigenvalue = 1 and normalizing it so that total probability = 1.

The probability distribution of the location of the grasshopper after many jumps (>=100) looks like:

And the ratio of the probability of the middle points to the end points is exactly 2.0.

If we change the code parameters to reduce the point spacing to 1 or .1 or .01, it does not change the probability ratio of 2. Also, the width of the hop distribution (20 in the problem) does not affect the ratio either for plausible values.



Dicey Business

This week’s Riddler asks us to pick 4 numbers such that we have the best chance of equaling at least one of the pairs of sums produced by a roll of 4 dice:

“We’re playing a game where you have to pick four whole numbers. Then I will roll four fair dice. If any two of the dice add up to any one of the numbers you picked, then you win! Otherwise, you lose.”

We look at a roll of 4 dice and observe the following:

  • There are 6 possible pairs of sums created by this roll of 4 dice (d1-d4) : {[d1+d2], [d1+d3], [d1+d4], [d2+d3], [d2+d4], [d3+d4]}
  • Suppose we want to have picks that cover all the possible rolls (i.e. win percentage = 100%). We certainly need to at least cover the rolls [1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4], [5,5,5,5], [6,6,6,6]. To do so requires us to have bets on [2,4,6,8,10,12].
  • We also see that since there are 4 dice, at least two of them must share the same parity, so the 6 pairs of sums for any roll of 4 will always include an even number. And since [2,4,6,8,10,12] are all the even numbers possible, we conclude that if we had 6 picks we would pick [2,4,6,8,10,12] and we would win 100 percent of the time.
  • But we only have 4 picks. What to do? Well we can look at the relative frequency of the different sums given one pair of dice:

So since 2 and 12 are relatively rare we might guess that a good answer for 4 picks would be [4,6,8,10]. Is it? We write a little Python code and vary our number of guesses from 1-6 and find the following:

Best 1 guesses: [7], Percent wins: 64.35
Best 2 guesses: [6, 7], [7, 8], Percent wins: 83.56
Best 3 guesses: [6, 7, 8], Percent wins: 91.98
Best 4 guesses: [4, 6, 8, 10], Percent wins: 97.53
Best 5 guesses: [2, 4, 6, 8, 10], [4, 6, 8, 10, 12] Percent wins: 99.00
Best 6 guesses: [2, 4, 6, 8, 10, 12], Percent wins: 100.00

So our intuition is correct.
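A sketch of the enumeration described above (brute force over all 6^4 rolls and all candidate pick sets):

```python
from itertools import product, combinations

# the set of pair sums produced by each of the 6^4 rolls of four dice
roll_sums = [{a + b for a, b in combinations(roll, 2)}
             for roll in product(range(1, 7), repeat=4)]

def win_percent(picks):
    """Percentage of rolls where some pair sum matches one of our picks."""
    picks = set(picks)
    wins = sum(1 for sums in roll_sums if picks & sums)
    return 100 * wins / len(roll_sums)

best = max(combinations(range(2, 13), 4), key=win_percent)
print(best, round(win_percent(best), 2))
```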



Cone Trip

This week’s Riddler Classic asks us to find the shortest trip on the base and sides of a cone :

“Two ants named Geo and Desik are racing along the surface of a cone. The circular base of the cone has a radius of 2 meters and a slant height of 4 meters. Geo and Desik both start the race on the base, a distance of 1 meter away from its center.

The race’s finish is halfway up the cone, 90 degrees around the cone’s central axis from the start, as shown in the following diagram:”

We will redraw the cone in two separate pieces: a top view of the cone and base and an unwrapped view of the sides of the cone:

Given the dimensions of this cone it unwraps into a perfect semi-circle with radius 4. Each point on the perimeter of the semi-circle maps to a point on the perimeter of the circular base, with a central angle (theta/2) one-half that of the circular base (theta).

We see that the trip around the cone has 2 pieces: first a traverse along the base to the “edge point” at the perimeter of the base, and then a trip up the side of the cone to the “finish” point. The total distance is the sum of these distances and we will calculate the first distance using the circular base drawing and the second using the semicircular drawing of the unwrapped sides:

Using the law of cosines we can see that the individual distances are:

d1  = [5 - 4*cos(\theta)]^{1/2}
d2' = [20 - 16*cos(\pi/4 -\theta/2)]^{1/2}

d= d1+d2' = [5 - 4*cos(\theta)]^{1/2} + 2*[5 - 4*cos(\pi/4-\theta/2)]^{1/2}

Taking the derivative with respect to theta, setting to zero, and re-organizing we see:

\cfrac{\sin(\theta)}{[5-4\cos(\theta)]^{1/2}} = \cfrac{\sin(\pi/4-\theta/2)}{[5-4\cos(\pi/4-\theta/2)]^{1/2}}

which is satisfied when the two arguments are equal:

\theta = \pi/4-\theta/2

\theta = \pi/6 =  30^{\circ}

So we see that the minimum trip distance is:

dmin = 3*[5-2*\sqrt{3}]^{1/2} \approx 3.717941 m

An interesting note is that exactly 1/3 of the distance is traveled on the base, and 2/3 on the side of the cone.
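A numerical check of the minimization (a sketch; the scan over theta is my own choice):

```python
from math import cos, sqrt, pi

def trip_length(theta):
    """Base leg to the edge point plus the unwrapped side leg to the finish."""
    d1 = sqrt(5 - 4 * cos(theta))
    d2 = 2 * sqrt(5 - 4 * cos(pi / 4 - theta / 2))
    return d1 + d2

# coarse scan of theta over [0, pi/2]
scan_min = min(trip_length(k * (pi / 2) / 100000) for k in range(100001))
d_min = trip_length(pi / 6)
print(scan_min, d_min, 3 * sqrt(5 - 2 * sqrt(3)))
```

The scan minimum agrees with the analytic value at theta = pi/6, and the base leg is exactly one-third of the total.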
