Hypothesis testing versus conspiracy theory
In 2020, after 26 years at the bottom of the sea, MS Estonia was the subject of a documentary that would throw authorities and the public around the Baltic Sea back to square one.
The official explanation of the unfortunate shipwreck may not be correct. Persuasive film clips and expert statements present plausible evidence that a hole in the ship’s hull appeared before the ship sank. An explosion, secret contraband vehicles, state secrets! The conspiracy theory spreads, the movie is praised, and authorities see themselves forced to start new, extensive investigations of a once-protected graveyard.
How do you know what is a conspiracy theory and what is a reasonable, scientifically based conclusion? Is medical research also full of potentially nonsensical conspiracy theories? Should babies sleep on their stomachs or on their backs to avoid sudden infant death syndrome? Should eggs be avoided to keep your cholesterol levels in check?
What may be published in the media and which medical drugs may be marketed are of course two very different things. One of several weighty pieces of this jigsaw is hypothesis testing, and it is hypothesis testing I want to emphasize in this newsletter.
Quantitative research, including drug development, relies heavily on inference theory. Research proceeds in small, distinct steps along a game plan where each step is a test of a hypothesis. With luck, the hypothesis is confirmed, for example when the primary endpoint of a clinical trial turns out statistically significant.
The whimsical thing about a hypothesis test is that we must assume the opposite of what we hope to prove (we call this the null hypothesis, H0), and may only reject it in favour of the alternative hypothesis (H1, what we hope to prove) if data opposes the null hypothesis beyond reasonable doubt.
[You:] I have a treatment against cancer in my back pocket.
[Media:] Exciting, please tell me, and we can publish it.
[The scientist:] Let H0 be that your treatment does not have an effect. Assume H0 is true. What, then, is the probability that your study shows an efficacy at least as high as what we actually observed?
The answer to the scientist’s question is called the p-value. If p is less than 0.05 (the conventional threshold), we declare statistical significance and say that our study demonstrates an effect. Note that if we do not reach significance, we still have no evidence in favour of the null hypothesis being true. The null hypothesis is just a construction that helps us draw conclusions about the alternative hypothesis.
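To make the p-value concrete, here is a minimal sketch using a hypothetical trial (the numbers are invented for illustration, not taken from the column): 100 patients receive the treatment and 61 respond, while under H0 the response rate would be 0.5. The p-value is then the probability, assuming H0 is true, of observing a result at least as extreme as the one we got.

```python
# A minimal sketch of a one-sided hypothesis test on a hypothetical
# single-arm trial: 100 patients, of whom 61 respond to treatment.
# H0: the true response rate is 0.5 (the treatment does not help).
# The p-value is P(X >= 61) under H0, where X ~ Binomial(100, 0.5),
# computed exactly with the standard library only.
from math import comb

n, k, p0 = 100, 61, 0.5  # hypothetical trial numbers, chosen for illustration

# Exact upper-tail binomial probability: sum P(X = i) for i = k .. n
p_value = sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

print(f"p-value = {p_value:.4f}")  # below 0.05, so we would reject H0 here
```

Note the direction of the reasoning: the code never computes the probability that the treatment works; it only computes how surprising the data would be if the treatment did nothing.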
So, we must assume that there is no treatment effect, that patients do not survive longer, and that the hole in Estonia’s hull was caused by the capsize, not the other way round (the hole causing the capsize).
[You:] The hole in Estonia caused her to sink.
[Media:] Exciting, please tell me, and we can publish it.
[The scientist:] Let H0 be that the hole did not cause the ship to sink. Assume H0 is true. What is then the probability that there is now a hole in the hull of the ship?
The new examinations show that the hole may very well, even most likely, have appeared after Estonia sank. The p-value, so to speak, is large (larger than 0.05) and hence provides no evidence that the hole caused the shipwreck.
Have a go yourself next time you suspect a conspiracy theory. Can you phrase the null hypothesis and the alternative hypothesis?
Of course, research does not automatically become credible just by a hypothesis test. Through repetition in independent studies we continuously build an overview, and a scientific field moves forward slowly along a winding road. Sometimes new views and questions lead to new studies, new knowledge and new perspectives.
Therefore, we nowadays put our babies to sleep on their backs, and we happily eat half a dozen eggs per day. Nor do we claim that Estonia sank because of a hole in the hull.
This column has also been published in the newsletter "Statistik är mer än siffror".
This article is part of our theme on News in English.