I’ve always found Bayes' Theorem hard to understand. After a few semesters of studying probability you can eventually get the hang of the math involved, but developing a solid understanding of the real-world implications can still be challenging.
The standard formula you often see is:
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
If we break up the formula, there are 4 parts, and 3 quantities that we need to find before we can get the answer.
$P(A)$: This is the probability of A happening, knowing nothing else about the problem.
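$P(B)$: This is the probability of B happening, knowing nothing else about the problem. Sometimes it's not possible to know this directly, but we can also find it from this property, where $\neg A$ means that A does not happen:
$$P(B) = P(B|A)\,P(A) + P(B|\neg A)\,P(\neg A)$$
We already know $P(A)$, and $P(\neg A)$ is just $1 - P(A)$, so we only need to find $P(B|A)$ and $P(B|\neg A)$ from the problem statement.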
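As a rough sketch, the property for finding the probability of B can be written as a small Python function. The function name and the input values here are made up for illustration and are not part of any problem in this post.

```python
def total_probability(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)."""
    p_not_a = 1.0 - p_a  # A and not-A cover the whole probability space
    return p_b_given_a * p_a + p_b_given_not_a * p_not_a


# Illustrative (hypothetical) values: P(A) = 0.3, P(B|A) = 0.8, P(B|not A) = 0.1
p_b = total_probability(p_a=0.3, p_b_given_a=0.8, p_b_given_not_a=0.1)
print(round(p_b, 4))  # 0.31
```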
$P(B|A)$: This is the probability of B happening given that A has happened.
$P(A|B)$: This is what we are trying to find. In words, it can be described as:
What is the probability of seeing A given that we have seen B?
or maybe:
What is the probability of A happening given what we know about B?
It can help to visualize the probability as a large square that can be divided into portions. The square can be split into two parts: the probability of hypothesis A happening, $P(A)$, and the probability of A not happening, $P(\neg A)$. These 2 events are mutually exclusive, and together they cover the entire probability space.
Let's adjust the probability of A happening, $P(A)$, and see how the probability space changes.
Then, by looking only at the left, we can think about the probability of B happening given A, $P(B|A)$. The product of these 2 together, $P(B|A)\,P(A)$, gives us the area of the region in the lower left corner.
We can then look at the right to determine $P(B|\neg A)$.
We now have everything we need to put these 3 quantities together to get the answer.
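Putting the 3 quantities together might look something like the following Python sketch. The function name `posterior` and the sample inputs are assumptions chosen for illustration; the denominator is expanded using the probability of A and of A not happening, as described above.

```python
def posterior(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    numerator = p_b_given_a * p_a                          # B and A together
    evidence = numerator + p_b_given_not_a * (1.0 - p_a)   # P(B) overall
    if evidence == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return numerator / evidence


# Illustrative (hypothetical) values: P(A) = 0.5, P(B|A) = 0.8, P(B|not A) = 0.2
print(round(posterior(0.5, 0.8, 0.2), 4))  # 0.8
```

In terms of the square, the numerator is the part of B that lies on the A side, and the denominator is all of B on both sides.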
You find that a family member tested positive for a genetic defect. What are the odds they actually have the defect?
The doctor tells you that 1% of people have this genetic defect. He also tells you that the test correctly detects the defect in 90% of the people who have it, and that 9.6% of the people who don't have it get a false positive.
Take a moment to think about how likely it is that your family member has the defect. You probably think it is very high, at least greater than 50%, right?
Let's take this information and figure out the probability using Bayes' Theorem.
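With A meaning "has the defect" and B meaning "tests positive":
$$P(A|B) = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.096 \times 0.99} \approx 0.0865 \quad (8.65\%)$$
There is only an 8.65% chance that your family member has the defect. This is much lower than you probably thought.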
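One way to sanity-check this result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 (the population size is my own choice; the rates are the ones the doctor gave) and asks what share of the positive tests belong to people who really have the defect.

```python
population = 10_000                        # hypothetical population size

with_defect = population * 0.01            # 1% have the defect -> 100 people
without_defect = population - with_defect  # the remaining 9,900 people

true_positives = with_defect * 0.90        # 90% of those with the defect test positive
false_positives = without_defect * 0.096   # 9.6% of those without it also test positive

# Share of all positive tests that come from people who have the defect
share = true_positives / (true_positives + false_positives)
print(f"{share:.4f}")  # 0.0865
```

Roughly 90 of about 1,040 positive tests are real, because the small false positive rate applies to a much larger group of people.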