Ingrid Lönnstedt: What does the p-value mean?
I still remember when my mathematics teacher at college in Cornwall, Ms Dangerfield, explained how to test somebody’s claim. That was when I realized how extremely useful and powerful statistics is. What claim did she test? Let me get back to that.
In order to explain the p-value, we need to distinguish between two scenarios, two hypotheses. A classic example is that of the old British lady who claims she can taste whether the milk was poured into her cup first, followed by the tea, or whether the tea was poured before the milk was added. The null hypothesis is the super boring, pessimistic truth that the lady cannot actually sense the difference. The alternative hypothesis is that the lady can in fact tell, and that is the scenario we would love to publish in the local newspaper if we manage to prove it.
Quantitative science is constructed so that we have to assume the grey and boring, the null hypothesis, as long as we haven’t been able to show that it is implausible. Our medical treatment has no healing effect, there is no difference between treatment groups, no difference in our patients after treatment compared to before. In our case, the lady cannot tell the difference between the cups of tea – she is just guessing!
Let us design a test! We serve the lady a cup of hot English Breakfast tea, and the lady is asked to guess whether the milk or the tea was poured into the cup first. If she is right, do you think that is sufficient to convince us that she can tell a difference?
– Naaah, if she is just guessing, she has a 50% chance of getting it right. That doesn’t feel convincing to me, you probably say.
But what if she guessed on 10 cups of tea and got them all right?
It sounds hard to guess 10 out of 10 by chance, don’t you agree? The probability of doing so is just 1 in 1024, roughly 1 in 1000, or 0.1%. What if she gets 9 out of 10? That sounds quite compelling as well. The probability of correctly guessing as many as 9 or 10 out of 10 is 11 in 1024, roughly 1 in 100, i.e. 1%.
And that is exactly what the p-value is. The p-value is the probability that our lady gets at least as many cups right as she actually did in our little study, IF she is only guessing, that is, if the null hypothesis is true.
P is the probability that the study shows results at least as convincing as those we observed, under the condition that the null hypothesis is true.
Both of these examples have significant p-values, below 0.05. The p-value for 8 correct guesses, that is, the probability of getting 8 or more right by pure guessing, is 0.055, just above 5% (one in 20). That means 8 hits is not enough for statistical significance, but it is close. I feel that guessing 8 out of 10 is quite an impressive achievement, and that in itself tells me something about the regulatory requirements in drug development. Indeed, we need very strong evidence to get a p-value below 0.05, so that we can use a clinical study as evidence towards a marketing application.
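For readers who want to check the numbers, the three p-values above can be reproduced with a few lines of Python. This is only a minimal sketch under the assumptions used in this column: each cup is an independent guess with a 50% chance of being right, and we count results at least as good as the one observed. The function name p_value and its parameters are made up for this illustration.

```python
from math import comb


def p_value(hits: int, cups: int = 10, chance: float = 0.5) -> float:
    """Probability of at least `hits` correct guesses out of `cups`,
    assuming every cup is an independent guess with success rate `chance`
    (the null hypothesis: the lady is only guessing)."""
    return sum(
        comb(cups, k) * chance**k * (1 - chance) ** (cups - k)
        for k in range(hits, cups + 1)
    )


# The three outcomes discussed in the text
for hits in (10, 9, 8):
    print(f"{hits} or more correct out of 10: p = {p_value(hits):.4f}")
```

With 10 cups this gives roughly 0.001, 0.011 and 0.055, so 9 or 10 correct guesses fall below the 0.05 limit while 8 lands just above it.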
And yes, Ms Dangerfield’s example was neither about tea nor clinical studies, but about her friend who claimed he could taste the difference between red wines from different years.
This column has also been published in the newsletter "Statistik är mer än siffror".
This article is part of our News in English theme.