Bayesian inference explained by doing - Bayes Calculator

| | Condition positive (CP) | Condition negative (CN) | |
|---|---|---|---|
| Test outcome positive (OP) | True positive | False positive | All OP |
| Test outcome negative (ON) | False negative | True negative | All ON |
| Base Rate | All CP | All CN | All Tested |

$$P(H|E) = \frac{P(E|H)\,P(H)}{P(E)} = \frac{\text{True positive}}{\text{All OP}}$$

Examples of Bayesian inference

Antibody Blood Test. By Alden Chadwick from Leeds, UK - Blood Test, CC BY 2.0

Bayes' Theorem, Bayesian Inference

Suppose you take a COVID-19 antibody blood test at home. Test kits are now available, and vendors guarantee around 99% specificity and 97-99% sensitivity. What does this mean? The vast majority of people, including academics, would answer: “if the test finds antibodies, I am 99% sure it's true” and accordingly “if I have no antibodies, in 97% of the cases the test will come out negative”.

But that is wrong! Imagine you got a false positive result (meaning the test says you have antibodies and are probably safe, although you have none) and you go to work without wearing a mask. The test would then lead you to take a big risk.

Let's do the math. Suppose the overall probability of having antibodies, the base rate, is 0.003 (US: 1.68 MM infected out of 330 MM is 0.005; Germany: 0.18 MM / 83 MM = 0.002; data as of the end of May 2020).

Suppose 10,000 people take the same test you took. With a base rate of .01 (a rounder figure than the estimate above), 100 of them will have antibodies. 99 of these get a positive test outcome (as the sensitivity is .99). The other 9,900 should yield a negative test outcome. However, if the test specificity is not 100% but only 98%, then 9900 * 0.02 (.02 = 1 - .98) of these 9,900 will yield a false positive outcome.
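The head count in this paragraph can be retraced in a few lines of Python. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the inputs are the example figures from the paragraph (base rate .01, sensitivity .99, specificity .98), not measured data. Run it to see how many of the positive results are actually correct.

```python
# Natural-frequency walk-through of the example above.
# All inputs are the illustrative figures from the text, not measured data.
n_tested = 10_000
base_rate = 0.01      # fraction of tested people who have antibodies
sensitivity = 0.99    # P(test positive | antibodies)
specificity = 0.98    # P(test negative | no antibodies)

# Split the tested group by condition.
with_antibodies = round(n_tested * base_rate)
without_antibodies = n_tested - with_antibodies

# Count the positive test outcomes in each group.
true_positives = round(with_antibodies * sensitivity)
false_positives = round(without_antibodies * (1 - specificity))
all_positives = true_positives + false_positives

# Share of positive outcomes that belong to people who really have antibodies.
share_correct = true_positives / all_positives

print(f"People with antibodies:    {with_antibodies}")
print(f"People without antibodies: {without_antibodies}")
print(f"True positives:            {true_positives}")
print(f"False positives:           {false_positives}")
print(f"Correct positive results:  {share_correct:.1%}")
```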

The result will come as a surprise! You are going to rub your eyes in disbelief! This app will help you understand what's going on here before you learn the math.

Terms and definitions

Today's example: antibody COVID-19 tests. We are going to use three variables:

  1. How certain is it that someone who has the "condition" (that is, has antibodies) will test positive? For an ideal test this would be 100% certain. This probability is denoted as sensitivity.
  2. How certain is it that someone who does not have the condition (that is, has never had COVID-19) will test negative, as they should? This probability is denoted as specificity.
  3. What is the magnitude of the outbreak? It is the fraction of all subjects who have the condition, and hence the fraction of "all", for instance all inhabitants of a country, who have it. (The subjects should form a representative sample of the population.) This probability is denoted as base rate or, if the condition is a disease, as prevalence.
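To see how the three variables relate to raw head counts, here is a minimal Python sketch that derives them from the four cells of a test-outcome table. The class name `ConfusionCounts` and the sample counts are assumptions made up for illustration; they are not taken from the app or from any study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Head counts for the four combinations of condition and test outcome."""
    true_positive: int
    false_positive: int
    false_negative: int
    true_negative: int

    def sensitivity(self) -> float:
        # Of everyone who has the condition, the fraction that tests positive.
        condition_positive = self.true_positive + self.false_negative
        return self.true_positive / condition_positive

    def specificity(self) -> float:
        # Of everyone who lacks the condition, the fraction that tests negative.
        condition_negative = self.true_negative + self.false_positive
        return self.true_negative / condition_negative

    def base_rate(self) -> float:
        # Of everyone tested, the fraction that has the condition.
        condition_positive = self.true_positive + self.false_negative
        all_tested = (self.true_positive + self.false_positive
                      + self.false_negative + self.true_negative)
        return condition_positive / all_tested


# Hypothetical counts, chosen only to illustrate the definitions.
counts = ConfusionCounts(true_positive=99, false_positive=198,
                         false_negative=1, true_negative=9702)
print(f"sensitivity = {counts.sensitivity():.2f}")
print(f"specificity = {counts.specificity():.2f}")
print(f"base rate   = {counts.base_rate():.2f}")
```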

Then, we are dealing with two "binary" variables, meaning they can assume but two values: true or false.

  1. The condition, which we call H for Hypothesis. If the condition is present, the hypothesis is true.
  2. The test result, which we call E for Evidence. If the test outcome is positive, there is evidence for the hypothesis: the evidence is true.

Finally, the quantity we are interested in is called P(H|E), meaning the probability of H given E (that is, of H being true given that there is evidence, namely a positive antibody test).
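P(H|E) can be computed directly from the three variables with Bayes' theorem. The sketch below is one way to write it; the function name `posterior` is hypothetical, and the sample inputs are assumptions that reuse the vendor figure of 99% for both sensitivity and specificity together with a few base rates, including the 0.002 and 0.005 estimates mentioned earlier.

```python
def posterior(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Return P(H|E): the probability of the condition given a positive test."""
    p_e_given_h = sensitivity            # P(E|H): positive test if condition present
    p_e_given_not_h = 1.0 - specificity  # P(E|not H): positive test if condition absent
    # P(E): overall probability of a positive test (law of total probability).
    p_e = p_e_given_h * base_rate + p_e_given_not_h * (1.0 - base_rate)
    # Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
    return p_e_given_h * base_rate / p_e


# Illustrative inputs only: same test quality, different base rates.
for rate in (0.002, 0.005, 0.05, 0.5):
    p = posterior(sensitivity=0.99, specificity=0.99, base_rate=rate)
    print(f"base rate {rate:.3f} -> P(H|E) = {p:.1%}")
```

Keeping the test quality fixed and varying only the base rate shows how strongly P(H|E) depends on how common the condition is among the people tested.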

All these definitions and symbols may feel a little overwhelming. But the good news is: that's already it!