Wednesday, January 04, 2006

Shor-Posner 2004: Statistical significance and alpha

Shor-Posner report that they performed statistical evaluations, "with alpha level at 0.05". What does that mean?

Short version: Suppose the treatment had no real effect. If a result at least as strong as the one they got would then turn up by chance less than 5% of the time (1 chance in 20), they will consider their result to be "statistically significant". The way we get that 5% number is by converting the 0.05 that Shor-Posner gives us (5 out of a hundred) to percent notation, or 5%.

Longer version: Because we can't observe the mechanism of cause-and-effect directly (except in a few very particular cases), we have to make inferences, or reason, about the connection between the results we see and the treatment we are testing.

But inference is tricky--just because something happens many times in a row does not guarantee that it will happen again next time. If I see 100 patients for whom massage is effective in relieving pain, I may be confident, based on experience, that it will work for the 101st patient as well--but I cannot guarantee it, because I can't directly observe the mechanism. So if I want to create a study to test whether massage reduces pain, I can't say I will show that it reduces pain in any number n of people, and then guarantee that it will work in the n + 1th person--induction just is not robust enough to guarantee that.

What I can do is take advantage of an asymmetry (unevenness) in the way logic works. While I can't prove from a repeated positive result that something will work for everyone, I can take advantage of the fact that all it takes to disprove something logically is one example to the contrary. In other words, here is the asymmetry:

Hypothesis 1: Massage reduces pain

Patient 1: true, Patient 2: true, Patient 3: true, ... Patient n: can't guarantee in advance before I test on this patient

Hypothesis 2: Massage causes no change in pain (the null hypothesis)

Patient 1: false

Already with my first patient, I have disproved my null hypothesis, and thus strengthened (not proved) the opposite of the null hypothesis--in other words, I have shown that it is false that massage causes no change in pain, and therefore, I have strengthened the opposite hypothesis, that massage causes a change in (reduces) pain. (In reality, I would do this for many more than just one patient, but it's the same idea.)
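To see why, in practice, one patient is not enough, here is a rough sketch in Python. It rests on an assumption that is mine, not Shor-Posner's: if massage truly made no difference, each patient would be as likely to report less pain as more pain, like a coin flip. The function names and the 0.05 threshold are purely illustrative.

```python
def chance_all_improve(n_patients: int) -> float:
    """Probability that every one of n patients improves by luck alone,
    under the ASSUMED null model where each patient is a 50/50 coin flip."""
    return 0.5 ** n_patients


def smallest_sample_size(alpha: float) -> int:
    """Smallest number of patients for which 'all of them improved'
    would happen by luck less often than alpha."""
    n = 1
    while chance_all_improve(n) >= alpha:
        n += 1
    return n


n = smallest_sample_size(0.05)
print(n, chance_all_improve(n))  # prints: 5 0.03125
```

Under that coin-flip assumption, a single improved patient would happen by luck half the time, so it tells us very little. Only when five patients in a row improve does luck alone become an unlikely explanation (about 3 chances in 100).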

You see the trick? Because the two situations aren't symmetrical, I can take advantage of that fact, couch my hypothesis as the null hypothesis, and try to disprove it, rather than pursuing the impossible goal of proving a positive hypothesis.

And that's where alpha comes in: alpha is how much risk I am willing to accept of rejecting the null hypothesis when it is actually true. I can never totally eliminate the possibility that what I am seeing is pure chance, but I can make it very, very small--my alpha level of 0.05 (5%) or less. I could make it 0.02 (2%) or less, or 0.001 (0.1%) or less--whatever is appropriate. The point is that once I have set that level, it is my threshold for rejecting, or failing to reject, the null hypothesis. For Shor-Posner, if the probability of getting a finding like hers by chance alone, assuming massage has no effect, is less than 5%, she will reject her null hypothesis (massage has no effect on the immune system of HIV+ Dominican children), and will consider her results--that massage does have a positive effect on their immune systems--statistically significant.
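The decision rule can be sketched in a few lines of Python. This is only an illustration, not the analysis Shor-Posner ran: it assumes a simple one-sided sign test (under the null hypothesis each patient is equally likely to improve or not), and the counts, 15 of 20 patients improving, are made up for the example.

```python
from math import comb


def one_sided_sign_test_p(n_improved: int, n_total: int) -> float:
    """Probability of seeing n_improved OR MORE improvements out of n_total
    if each patient were a 50/50 coin flip (the assumed null hypothesis)."""
    favorable = sum(comb(n_total, k) for k in range(n_improved, n_total + 1))
    return favorable / 2 ** n_total


def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis only if the p-value falls below alpha."""
    return p_value < alpha


# Hypothetical data, invented for illustration: 15 of 20 patients improve.
p = one_sided_sign_test_p(15, 20)
print(round(p, 4), is_significant(p))               # prints: 0.0207 True
print(is_significant(p, alpha=0.001))               # prints: False
```

Notice that the same made-up result counts as statistically significant at an alpha of 0.05 but not at an alpha of 0.001. The data have not changed; only the threshold I committed to in advance has.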
