HPS 0628 | Paradox
Paradoxes From Probability Theory:
Mutual Exclusivity
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
http://www.pitt.edu/~jdnorton
For a compact reminder of probability theory, see Probability Theory Refresher.
Probability theory provides a quantitative calculus for dealing with chance and uncertainty. It is one of the most successful analytic tools available to us and is often called upon to correct misconceptions about chance. These corrections have an air of paradox to them, since they give results that are, at least initially, quite unexpected. They have a strong presence in the paradox literature. They are not paradoxes of the type that reveals a contradiction in our presuppositions. Rather they are paradoxes in the sense that they present us with unexpected results. This chapter presents a brief sampling of them.
Since the paradoxes arise through corrections FROM probability theory TO common misconceptions, they belong in the "from" chapters. Subsequent chapters will investigate a reversed problem. While the probability calculus can be applied profitably to a very large array of problems, there are some circumstances in which it fails. These are presented as paradoxes FOR probability theory.
The paradoxes from probability theory can be categorized loosely into those that arise from improper assessments of mutual exclusivity, from improper assessments of probabilistic independence and from improper assessments of the import of expectations. Examples of each are given in this chapter and the following two chapters.
The commonality in these paradoxes is that they involve a failure to consider all the mutually exclusive outcomes that comprise the outcome space, in a space in which all the most specific, mutually exclusive outcomes have the same probability. Rather, two or more mutually exclusive cases are treated as one case and thus their probabilities are underestimated.
This paradox, already described in the Budget of Paradoxes, involves just this mistake concerning mutually exclusive outcomes. Here is the paradox again:
At a carnival sideshow, you are invited to play the following game. There are three cards. One is blue on both sides. One is red on both sides. One is blue on one side and red on the other.
To play the game, you are allowed to shuffle and flip the cards without looking until they could be in any order with any side up.
A card is drawn and it is red. You are offered an even odds bet that the other side of the card is blue.
It seems a fair bet. The color of the other side of the card is either red or blue, so each has (it would seem) an equal chance. That is:
(??) The probability that the color of the other side of the card is blue is 1/2. (??)
This is incorrect and overestimates the chance that the other side is blue, which then inclines us to an unfavorable wager. The ease with which we fall into this error makes it easy for the swindle to occur.
The error in the analysis is that the set of mutually exclusive outcomes has not been assessed correctly. When the card is drawn, there are three mutually exclusive outcomes possible:
The card is red-blue and the red side is uppermost.
The card is red-red and side-1 is uppermost.
The card is red-red and side-2 is uppermost.
If we imagine that the two sides of the red-red card are explicitly numbered, the three possible, mutually exclusive outcomes are shown as:
The shuffling and flipping described above ensures that each of these outcomes has equal probability. In two of three outcomes, the other side of the card is red; and in only one is the other side blue. Since these three outcomes are equally probable, it follows that
The probability that the color of the other side of the card is blue is 1/3.
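The count of equally probable outcomes can be checked by brute-force enumeration. The following is only a minimal sketch: the representation of each card as a pair of faces and the names `cards`, `outcomes` and `red_up` are illustrative assumptions, not part of the original presentation.

```python
from fractions import Fraction

# Each card is a pair of faces. The two faces of the red-red card are
# distinct physical sides even though they share a color.
cards = [("blue", "blue"), ("red", "red"), ("red", "blue")]

# Most specific outcomes: (color showing, color hidden) for every card and
# every choice of which face is up. All six are assumed equally probable.
outcomes = []
for faces in cards:
    for up in (0, 1):
        outcomes.append((faces[up], faces[1 - up]))

# Reduced outcome space: keep only the outcomes in which red is showing.
red_up = [o for o in outcomes if o[0] == "red"]
p_other_blue = Fraction(sum(1 for o in red_up if o[1] == "blue"), len(red_up))
print(len(red_up), p_other_blue)  # prints: 3 1/3
```

Three outcomes survive once a red face is seen, and only one of them has blue on the hidden side.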
Another common formulation of what is essentially the same paradox concerns a two-child family. The essential background assumptions are that:
probability of boy P(B) = probability of girl P(G) = 1/2
and that the probabilities of the gender are independent of the genders of the other children in the family.
Until we have more information that restricts the possibilities, the outcome space is comprised of four mutually exclusive outcomes:
GG, GB, BG, BB
where "GG" = "first child is a girl; second child is a girl" etc.
This idealized set of assumptions could equally be realized with independent coin tosses.
We are asked two questions:
"one": One of the children is a girl, G. What is the probability that the other is a girl, G?
"first": The first child is a girl, G. What is the probability that the second is a girl, G?
It is easy to assume that both questions have the same answer. For in both cases, the other child might be a boy, B, or a girl, G. These two cases then suggest a probability of 1/2 for each. That is:
(??) In both "one" and "first," the probability of a girl, G, is 1/2. (??)
Once again, this conclusion is mistaken. The two questions each lead to different sets of possible, mutually exclusive outcomes and thus different reduced outcome spaces.
In the case of "one," we know that one of the children is a girl. We are not told which is the girl. It could be either the first or the second child. Hence there are three remaining mutually exclusive outcomes possible:
GG, GB, BG
Each has equal probability. In two of these outcomes, the other child is a boy; and in only one is the other child a girl. So we arrive at the correct result:
In "one," the probability of a girl, G, is 1/3.
The case of "first" leads to a different set of mutually exclusive outcomes that are still possible. They are:
GG, GB
Each has equal probability and in only one of them is the other, second child a girl. So we have a different result from "one":
In "first," the probability of a girl, G, is 1/2.
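The two reduced outcome spaces can be generated mechanically. This is a sketch under the stated idealization (each of the four families equally probable), and it reads "one" as "at least one of the children is a girl," which is the reading that yields the three outcomes GG, GB, BG. The helper `prob` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Full outcome space: GG, GB, BG, BB, all equally probable.
families = ["".join(f) for f in product("GB", repeat=2)]

def prob(event, given):
    """P(event | given), computed by counting equally probable outcomes."""
    reduced = [f for f in families if given(f)]
    return Fraction(sum(1 for f in reduced if event(f)), len(reduced))

def both_girls(f):
    return f == "GG"

p_one = prob(both_girls, lambda f: "G" in f)        # "one": at least one girl
p_first = prob(both_girls, lambda f: f[0] == "G")   # "first": first child is a girl
print(p_one, p_first)  # prints: 1/3 1/2
```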
A more striking difference between questions like "one" and "first" appears if we increase the size of the family.
The revised questions for a ten-child family are:
"nine": Nine of the children are girls, G. What is the probability that the other is a girl, G?
"first nine": The first nine children are girls, G. What is the probability that the tenth is a girl, G?
As before, we should resist the temptation to say that in each case the probability of girl, G, is the same: 1/2. The two cases are seen to be very different if we enumerate the mutually exclusive possibilities each allows.
In the case of "nine," there are eleven mutually exclusive possibilities that form the reduced outcome space. Using the notation above, they are:
GGGGGGGGGG,
GGGGGGGGGB,
GGGGGGGGBG,
GGGGGGGBGG,
GGGGGGBGGG,
GGGGGBGGGG,
GGGGBGGGGG,
GGGBGGGGGG,
GGBGGGGGGG,
GBGGGGGGGG,
BGGGGGGGGG
In only one of these eleven mutually exclusive outcomes is the remaining child a girl. Since these are equally probable, we have:
In "nine," the probability of a girl, G, is 1/11.
The case of "first nine" is different. There are only two remaining, mutually exclusive outcomes possible:
GGGGGGGGGG,
GGGGGGGGGB
In one of these two cases, the tenth child is a girl. Since these mutually exclusive outcomes are equally probable, we have:
In "first nine," the probability of a girl, G, is 1/2.
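The same enumeration scales to the ten-child family. The sketch below lists all 2^10 equally probable families and filters them. It assumes that "nine" means "at least nine of the ten children are girls," which is the reading that produces the eleven outcomes listed above. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

N = 10
# All 2**10 families, each assumed equally probable.
families = ["".join(f) for f in product("GB", repeat=N)]

# "nine": at least nine of the ten children are girls.
nine = [f for f in families if f.count("G") >= N - 1]
# "first nine": the first nine children are girls.
first_nine = [f for f in families if f[: N - 1] == "G" * (N - 1)]

all_girls = "G" * N
p_nine = Fraction(nine.count(all_girls), len(nine))
p_first_nine = Fraction(first_nine.count(all_girls), len(first_nine))
print(len(nine), p_nine)              # prints: 11 1/11
print(len(first_nine), p_first_nine)  # prints: 2 1/2
```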
We might try to conceive these last paradoxes as arising from errors in judgments of independence. It may seem, for example, that the outcome of nine children being girls and the outcome of the remaining child being a girl are two dependent outcomes. The error is to treat them as independent. This is a weaker diagnosis. The difficulty is that the second outcome in this analysis--the other child is a girl--is not an outcome well-defined prior to specification of the first outcome (that nine children are girls). That imprecision is the deeper cause of the problem and better remedied by a more careful description of the mutually exclusive outcomes of the reduced outcome space.
The Monty Hall problem is a curiously deceptive problem in probability theory. It is deceptive since it looks like one of the simplest and most familiar calculations in probability, conditionalization. Novices and people who are familiar with probability theory alike tend to judge the problem too quickly and jump to what looks like an easy solution. While that jump works in many similar cases, this case has a subtle addition that makes the familiar move fail.
The problem had been discussed in various forms in the recreational mathematics literature for a while before it became prominent. It might never have attracted much attention had it not been discussed in Marilyn vos Savant's widely syndicated "Ask Marilyn" column of Parade magazine in the 1990s. She analyzed the problem carefully and gave the correct answer. That answer disagreed with the result that comes from treating it as a simple conditionalization problem. What followed was a flood of letters incorrectly protesting her solution and even, ironically, lamenting the poor state of US mathematics education.
The version of the problem considered here arises in a TV game show. There are three closed doors, numbered 1, 2 and 3. Behind one of them is a car and behind each of the other two is a goat. We are to suppose that the car and goats have been placed randomly such that there is a probability of 1/3 that the car is behind each of the three doors. The contestant can pick one of the three doors. If the car is behind that door, the contestant wins the car. Otherwise the contestant wins nothing.
We shall assume that the contestant chooses door 1. For the rest of the analysis, it does not matter which door the contestant chooses. The analysis would be the same, but with different door numbers, if the contestant chose either of the other two doors instead.
The outcome space so far has three equally likely possibilities. Writing "P" for probability:
To draw out the game a little, Monty Hall offers a deal. The contestant can switch the choice to a different door. To make the decision more interesting, Monty Hall then opens one of the other two doors to reveal a goat. Let us say that it was door 3. (Again, which of door 2 or 3 is chosen will not make a difference to the analysis.)
The outcome space is now:
The big question is: should the contestant switch the choice from door 1 to door 2? Two answers are commonly circulated.
First: There is no gain in switching. The contestant has no reason to prefer either of the remaining doors 1 and 2. They have an equal probability of 1/2.
Second: There is a gain in switching. The probability of the car behind door 1 is still 1/3, but the probability of the car behind door 2 is increased to 2/3.
Here is the mistaken but easily adopted analysis.
The intuition behind this mistaken analysis is that opening door 3 by Monty Hall only gives the contestant information about what is behind door 3. It reveals nothing that can favor either of door 1 or door 2 as the location of the hidden car. The probability that either hid the car was the same, 1/3, before Monty Hall opened door 3. It is the same after door 3 is opened. They are still equal. Since there are now only two equally probable possibilities, they each have the probability 1/2.
The outcome space is now:
This intuition can be supported by some formal analysis. We are interested in the probability that the car is behind door 1 or behind door 2, conditioning on the new knowledge gained when Monty Hall opens door 3. To compute this conditional probability, we use the rule of conditionalization discussed in the Probability Refresher. It says for P(A|D), "the probability of A given that D is the case":
P(A|D) = P(A and D) / P(D)
We can see how the rule would be applied correctly in a similar problem. There are three boxes A, B and C. A red marble is hidden inside one of them, with a probability of 1/3 for each box. That is, we have
Outcome A = marble in box A. P(A) = 1/3
Outcome B = marble in box B. P(B) = 1/3
Outcome C = marble in box C. P(C) = 1/3
We now learn that the marble is not in box C. That is, we learn (A or B). We apply the rule of conditionalization above:
P(A|A or B) = P(A and (A or B)) / P(A or B)
= P(A) / P(A or B) = (1/3) / (2/3) = 1/2
A similar calculation gives
P(B|A or B) = 1/2.
That is, after we learn that the marble is not in box C, we still have no reason to distinguish box A or box B as the location of the hidden marble. The two possibilities have equal probability.
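The marble calculation is a direct instance of the rule P(A|D) = P(A and D) / P(D). As a minimal sketch, events can be modeled as sets of the most specific outcomes. The function name `conditional` and the dictionary `prior` are assumptions made for illustration only.

```python
from fractions import Fraction

# Prior over the most specific outcomes: marble in box A, B or C.
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

def conditional(prior, event, given):
    """P(event | given) = P(event and given) / P(given); events are sets of outcomes."""
    p_given = sum(prior[o] for o in given)
    p_joint = sum(prior[o] for o in event & given)
    return p_joint / p_given

not_c = {"A", "B"}  # what we learn: the marble is not in box C
p_a = conditional(prior, {"A"}, not_c)
p_b = conditional(prior, {"B"}, not_c)
print(p_a, p_b)  # prints: 1/2 1/2
```

Nothing in this sketch models how the empty box came to be revealed. That is harmless for the marble game as described here.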
The temptation is to give the same analysis in the Monty Hall problem. We start with:
Probability(car-door-1) = Probability(car-door-2) = Probability(car-door-3) = 1/3
We learn from Monty Hall opening door 3 that the car is not hidden there: "not-car-door-3" = "car-door-1 or car-door-2".
The rule of conditional probability tells us how to compute "probability that the car is behind door 1 given that the car is behind door 1 or door 2":
In the formula for conditional probability above we set:
A = car-door-1, D = car-door-1 or car-door-2, so that
A and D = (car-door-1) and (car-door-1 or car-door-2) = (car-door-1).
The calculation is:
P(car-door-1 | car-door-1 or car-door-2)
= P(car-door-1) / P(car-door-1 or car-door-2)
= (1/3) / (2/3) = 1/2
A similar calculation gives
P(car-door-2 |car-door-1 or car-door-2) = 1/2
That is, there is an equal chance of 1/2 that the car is behind either door 1 or door 2. The contestant gains nothing in switching.
Appealing as this analysis may be, it is based on a false assumption:
??? Monty Hall's opening of door 3 to reveal a goat only gives information to the contestant about what is behind door 3. ???
NO!
Monty Hall's decision to open door 3 derives from his knowledge of where the car is hidden. He would not open door 3 if that is where the car is hidden. That his choice is conditioned by that knowledge provides useful information to the contestant. There are two cases:
Case door 2. The car is hidden behind door 2. Then Monty Hall has no choice. He must open door 3, else he gives away the location of the car.
Case door 3. The car is hidden behind door 1. Then Monty Hall does have a choice. It is common to assume that Monty Hall chooses randomly. That is, there is a probability of 1/2 that Monty Hall opens either of doors 2 or 3.
That Monty Hall has no choice in which door to open in "case door 2" is, we shall see, what provides the contestant extra information about the location of the car. In this aspect, the Monty Hall problem is unlike the problem of the three boxes and the marble. In the marble game, there is no restriction on which of the empty boxes is to be revealed as empty. The simple analysis by conditionalization is correct.
It is tempting to dismiss the influence of Monty Hall's knowledge by saying that the contestant does not know which of these two cases applies. That is not quite right. The contestant does know that each case can arise with equal probability; and that is enough to change the probabilities concerning where the car is hidden.
The contestant can analyze the situation as follows:
If "case door 3" applies, then there is no basis for preferring door 2 over door 1.
If "case door 2" applies, then the car is assuredly behind door 2.
Since "case door 2" might apply with some non-zero probability, while there is no assurance that the car is behind door 2, its probability is increased. All that remains is to determine how much the probability is increased.
This increase is quickly determined by seeing how Monty Hall's opening of door 3 has changed the game.
Originally, the contestant had to choose among 3 possibilities: door 1, door 2 or door 3.
After door 3 is opened, there are only two choices: stay with door 1 or switch to door 2.
"stay" Contestant wins if the car was hidden behind door 1. (Probability = 1/3)
"switch" Contestant wins if the car was originally hidden behind either door 2 or door 3. (Probability = 2/3)
With these new probabilities, the contestant increases the chances of winning the car by switching. Switching doubles the probability of winning the car from 1/3 to 2/3.
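The stay and switch probabilities can also be checked empirically with a simulation. What follows is a rough sketch, not a proof. It assumes that the contestant always starts on door 1, that Monty Hall never opens the contestant's door or the door hiding the car, and that he chooses at random when he has two options. The function names `play` and `win_rate`, the trial count and the seed are arbitrary choices for illustration.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = 1  # the contestant's first choice, as in the text
    # Monty Hall opens a door that is neither the pick nor the car,
    # choosing at random when two doors are available to him.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of games won over many independent plays."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}  switch: {switch_rate:.3f}")
```

With this many trials the two frequencies should land close to 1/3 and 2/3 respectively, though the exact figures depend on the random number generator.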
In the last two sections, we saw two analyses of the Monty Hall problem. Superficially, on a rapid skim, it may seem that both are cogently argued. How can we reassure ourselves that the second analysis is the correct one? In cases like this, it is often helpful to map out the full outcome space. We identify the most specific set of all mutually exclusive outcomes. From them, we read off the probabilities. This is one of those cases.
When we undertake the exercise of mapping out the full outcome space, we find that Monty Hall's actions must be included. A more complete outcome space has, as its most specific outcomes, pairs consisting of the location of the hidden car and which door Monty Hall opens. The combination of all such pairs is conveniently represented in the table shown below. The cell in the top left arises from the car behind door 1 paired with Monty Hall opens door 1.
| | Car behind door 1 | Car behind door 2 | Car behind door 3 |
| Monty Hall opens door 1 | 0 | 0 | 0 |
| Monty Hall opens door 2 | 1/6 | 0 | 1/3 |
| Monty Hall opens door 3 | 1/6 | 1/3 | 0 |
The numbers in the cells are the probabilities of each outcome. Since Monty Hall will not open door 1, all cells in the first row are assigned probability zero. The probabilities in the remaining cells are easily recovered. If the car is behind door 2, then Monty Hall must open door 3. There is a probability of 1/3 that the car is behind door 2, so the corresponding cell is assigned probability 1/3.
The interesting case arises when the car is behind door 1. That case arises with probability 1/3. In it, Monty Hall can choose to open door 2 or door 3 and, we assume, he does so with probability 1/2 for each. Thus the probability for each cell is 1/3 x 1/2 = 1/6 as shown.
When Monty Hall actually opens door 3, the first two rows of the outcome space are eliminated. We are left with the third row only. To ensure that all the probabilities sum to unity, the two probabilities of the third row--1/6 and 1/3--must be scaled up in proportion so that they sum to unity. That is, they become 1/3 and 2/3 and the reduced outcome space is:
| | Car behind door 1 | Car behind door 2 | Car behind door 3 |
| Monty Hall opens door 3 | 1/3 | 2/3 | 0 |
These are the new probabilities the contestant finds. The probability that the car is behind door 2 is twice that of the car behind door 1. The contestant's chances of winning double if the contestant swaps doors.
This last operation of eliminating two rows and scaling up the probabilities is carried out more formally by the rule of conditionalization. This is the correct way to apply the rule of conditionalization to the problem, and it corrects the mistaken version shown earlier.
We read some unconditional probabilities from the table prior to elimination of the first two rows:
P(car-door-1 and MH-opens-3) = 1/6
P(car-door-2 and MH-opens-3) = 1/3
P(MH-opens-3) = 1/6 + 1/3 = 1/2
First, consider the probability that the car is behind door 1, given that Monty Hall has opened door 3:
P(car-door-1 | MH-opens-3)
= P(car-door-1 and MH-opens-3) / P(MH-opens-3)
= (1/6) / (1/2) = 1/3
Second, consider the probability that the car is behind door 2, given that Monty Hall has opened door 3:
P(car-door-2 | MH-opens-3)
= P(car-door-2 and MH-opens-3) / P(MH-opens-3)
= (1/3) / (1/2) = 2/3
These more formal computations agree with the probabilities assigned in the reduced outcome space.
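Both the full table and its conditionalization can be reproduced with exact fractions. This is a sketch under the assumptions already stated in the text: the contestant is on door 1, Monty Hall never opens that door or the door hiding the car, and he picks with probability 1/2 when two doors are available. The names `p_open`, `joint` and `posterior` are hypothetical and introduced here only for illustration.

```python
from fractions import Fraction

doors = (1, 2, 3)
pick = 1  # the contestant's door

def p_open(opened, car):
    """P(Monty Hall opens `opened` | car is behind `car`), contestant on door 1."""
    allowed = [d for d in doors if d != pick and d != car]
    return Fraction(1, len(allowed)) if opened in allowed else Fraction(0)

# Joint table: P(car behind c and Monty Hall opens m)
#            = P(car behind c) * P(Monty Hall opens m | car behind c).
joint = {(c, m): Fraction(1, 3) * p_open(m, c) for c in doors for m in doors}

# Conditionalize on "Monty Hall opens door 3": keep that row and rescale it.
p_opens_3 = sum(joint[(c, 3)] for c in doors)
posterior = {c: joint[(c, 3)] / p_opens_3 for c in doors}
print(p_opens_3, posterior[1], posterior[2], posterior[3])  # prints: 1/2 1/3 2/3 0
```

The entries of `joint` match the cells of the table, and `posterior` matches the reduced outcome space.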
How is it possible that we humans have survived so long in a chance filled world, given our evident difficulty with assessing chances correctly?
August 10, November 17, 25, 2021. May 5, 2022. April 9, 2023.
Copyright, John D. Norton