Combinatoric Derivatives

This week’s Riddler Classic asks:

“You have an urn with an equal number of red balls and white balls, but you have no information about what that number might be. You draw 19 balls at random, without replacement, and you get eight red balls and 11 white balls. What is your best guess for the original number of balls (red and white) in the urn?”

This is standard combinatorics problem with the probability of the draw conditional on n total balls, and r red and w white, being:

p(n,r,w) = \dfrac {\binom{n/2}{r}\cdot\binom{n/2}{w}}{\binom{n}{r+w}}

So here the maximum likelihood is the value of n that maximizes this probability given r=8 and w = 11, where n has to be an even integer, with a minimum possible value of 22.

So we can just successively calculate the probabilities starting at n = 22 and we find a maximum at n= 34, with p = 16.21%.

But is there a way we can try to maximize this expression without a trial and error approach? This might be useful for very large n for example. There is at least an approximate approach.

First we will rewrite this expression as:

p = \dfrac {(r+w)!}{r!\cdot w!} \cdot  \frac {\displaystyle \prod_{n/2-r+1}^{n/2} \cdot \prod_{n/2-w+1}^{n/2}} {\displaystyle \prod_{n-r-w+1}^{n}}

Even though n, r and w are integers, this function is pretty smooth. So we will take the approach of looking for a value of n that makes p stationary under small changes of n. Here a small change will be a change in n of – 2, and a change in n/2 of -1. We will then look for values of n such that p’ – p = 0.

p'-p =  \frac {\displaystyle \prod_{n/2-r}^{n/2-1} \cdot \prod_{n/2-w}^{n/2-1}} {\displaystyle \prod_{n-r-w-1}^{n-2}} - \frac {\displaystyle \prod_{n/2-r+1}^{n/2} \cdot \prod_{n/2-w+1}^{n/2}} {\displaystyle \prod_{n-r-w+1}^{n}}  = 0

Re-arranging terms and cross-multiplying we find:

p'-p =  \dfrac {(n/2) \cdot (n/2)} {(n/2-r) \cdot (n/2-w)} = \dfrac {n*(n-1)}{(n-r-w-1)\cdot(n-r-w)}

Cross multiplying again, collecting terms and simplifying we obtain the following expression for n:

n = \dfrac {4r \cdot w}{r + w - (r-w)^2}

Plugging in r= 8 and w = 11 we get n = 35.2 which is pretty close to the correct answer, 34. The way we have set up the p'(n)-p(n-2) = 0 condition, the calculated values of p’ and p will likely straddle the correct maximum. Therefore the exact answer for n appears to be the greatest even integer <= the calculated value of n.

Note: it is not clear that there are maximum values of p for all combinations of r and w. If we look at the denominator of our expression it appears that for values that result in a denominator value <= 0 the problem may not have a finite maximum likelihood value of n.

Here are some plots of red/white combinations that meet the above denominator condition for maximum likelihood values corresponding to a finite n:

Combinatoric Derivatives

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s