I’ve been reading ET Jaynes’ book on probability theory and encountered an interesting little result.

Consider the classic urn model where we are presented with a bucket containing 4 balls (2 red and 2 white) and we draw from this urn without replacement. Denote to be the event that a red ball is picked from the urn on the draw. So would be interpreted as the probability of picking a red ball on the first draw.

Now consider the following probabilities and . For the second probability, we consider the case where the red ball is picked on the 2nd or 3rd draw (possibly both). Which probability do you suppose is greater? Think about this question for a second before continuing reading.

Let’s imagine that the sampling experiment has already taken place and you selected all the balls from the urn blindly. I tell you that the 2nd ball you selected was red. What do you suppose are the chances the first ball you drew was also red? This is easy to calculate…

This result easily matches our intuition. If 1 of the 3 remaining balls that is unaccounted for is red, we should expect to have a 1 in 3 chance of having selected it first.

Now consider the case where I tell you that either the 2nd or 3rd ball (possibly both) are red. Let’s calculate the chance that the first ball selected is also red…

Therefore . This result clashes with my own intuition. I would expect that knowledge of a red ball picked on the 2nd or 3rd draw would reduce my chances of having picked a red ball on the 1st draw, especially since includes the possibility of both red balls being selected on the 2nd and 3rd draws. How is it that the knowledge of a red ball possibly being drawn in an additional spot actually increases our odds of selecting it on the first draw?

Here is how Jaynes attempts to intuit the phenomenon: *The information reduces the number of red balls available for the first draw by one, and it reduces the number of balls in the urn available for the first draw by one, giving . The information [ ] reduces the ‘[expected] number of red balls’ available for the first draw, but it reduces the number of balls in the urn available for the first draw by two.*

So similarly to how we calculate , we should think of , where is the expected number of red balls removed when we know that the 2nd, 3rd, or both picks possibly withdrew a red ball. Let’s do that:

Thus we attain our desired result.

After encountering this result I couldn’t help but think about the famous Monty Hall problem. Both lead to seemingly contradictory results where probabilities change in seemingly counter-intuitive ways based on our knowledge. Is there really a relationship between the Monty Hall problem and the above result? Or is the connection tenuous at best?