Simple researcher use ANOVA

At a recent academic conference I listened to a talk by a researcher who told the audience about the great experiment he had done and how the preliminary analysis had uncovered a 4-way interaction.

When I heard this and looked at the slide, I first thought I was hallucinating. First, there was a p-value of 0.09. Then an N of 53. Then an eta squared of 0.01. But my head was still stuck on “four-way interaction”. I was thinking: What on earth is this? And how did he con himself into believing it was real?

People, we have to talk about a severe problem in statistics and a technique that has become a bad default. And yes, it is the ANOVA. Used one of those lately? Yes? Then buckle up, because this text will probably bruise your ego a little. Because I haven’t. And I think you shouldn’t have. And I also think you shouldn’t do it again. But no worries, I will show you something greater – something that is the same, just better.

First, let me say that I firmly believe ANOVA is a legitimate method of analysis in a few use cases – very few. “But factorial data is everywhere,” you might exclaim, thinking about all the experimental conditions and groups you have been conditioned to apply during your training. I’m sorry to inform you: most of your conditions are clever ways to scam yourself into believing you’re doing something worthwhile, while your variance sits in the corner and cries because you neglected it. But let me start at the beginning.

The typical ANOVA is a special form of regression. It looks at factorial data and partitions the variance of a metric outcome among the factors. This is why in R, you can also do an ANOVA with the lm() function – a fact unknown to a lot of people, surprisingly.
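You can see this for yourself. Here is a minimal sketch (with made-up data and variable names, purely for illustration) showing that aov() and lm() fit exactly the same model – the ANOVA table is just one way of summarizing a regression:

```r
# Made-up balanced two-factor data, purely for illustration
set.seed(1)
d <- data.frame(
  a = factor(rep(c("low", "high"), each = 20)),
  b = factor(rep(c("ctrl", "treat"), times = 20))
)
d$outcome <- rnorm(40, mean = ifelse(d$a == "high", 1, 0))

fit_aov <- aov(outcome ~ a * b, data = d)  # the "classic" ANOVA
fit_lm  <- lm(outcome ~ a * b, data = d)   # the same model as a regression

summary(fit_aov)  # the familiar ANOVA table
anova(fit_lm)     # identical sums of squares and F tests
summary(fit_lm)   # the same model, reported as coefficients
```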

The thing is just that, for some reason, psychologists are taught that an ANOVA is a nice thing to do because we often research factorial influences on data. In fact, I myself was taught ANOVA before regression in my stats classes. And after reading your first batch of ancient social psych papers, you will probably assume, as a relatively fresh student, that ANOVAs are common and a good method for anything that comes your way.

But they are not. Why? Here are a few reasons:

  1. Most variables in psychology are quasi-metric, and modern regression estimators can incorporate them well.
  2. ANOVAs are, imho, the reason why people still do shit like age groups, or factorizing metric variables in general, which is bad practice.
  3. Calculating your N for decent power gets very dicey once you include interactions.
  4. Most people cannot safely and easily interpret higher-order interactions, i.e. anything beyond a two-way (Occam recommends: don’t do it!). AND
  5. Most interactions are statistical noise, as they have severely lower power than main effects (see the simulation sketch after this list).
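Here is a rough simulation sketch of reason 5 (all numbers invented). Even when the true interaction coefficient is exactly as large as the main-effect coefficients, the interaction test rejects far less often, because the interaction contrast is effectively “smaller” – its standard error is twice that of the main effects under this coding:

```r
# Simulate a 2x2 design where main effects and interaction have the
# SAME true coefficient, then count how often each test is significant.
set.seed(42)
power_sim <- function(n = 52, beta = 0.5, reps = 2000) {
  hits <- replicate(reps, {
    f1 <- rep(c(-0.5, 0.5), each = n / 2)   # effect-coded factor 1
    f2 <- rep(c(-0.5, 0.5), times = n / 2)  # effect-coded factor 2
    y  <- beta * f1 + beta * f2 + beta * f1 * f2 + rnorm(n)
    p  <- summary(lm(y ~ f1 * f2))$coefficients[-1, 4]  # p-values
    p < 0.05
  })
  rowMeans(hits)  # rejection rates for f1, f2, and f1:f2
}
power_sim()
```

The interaction’s rejection rate comes out well below the main effects’ – with an identical true coefficient.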

But the last, and probably most controversial, reason is: ANOVAs can make quasi-causal analyses (and therefore fraud) so much easier. As a psychologist you learn that, for a causal interpretation, you need experimental conditions, i.e. control groups and effect groups. If you have that, you can create a factor that indicates the condition and use it to evaluate the cause of your outcome. Unfortunately, a lot of people think the factor is the important thing, which it is not, especially if both conditions rely on a variable that has a baseline in every participant. Let’s say you want to know whether high stress makes you sweat. You sample a thousand people, put half of them in some sort of high-stress condition, and leave the other half as they are. Obviously, you have to measure stress in both cases to prove that your control group is indeed the control, and that the high-stress group has high stress, right? And if your ANOVA tells you the high-stress group differs significantly, you have a causal effect, right?
Answer: No. You uncovered a correlation between stress and sweating and factorized your IV.
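A minimal sketch of this scenario (all data simulated, names hypothetical): generate a continuous stress variable with a baseline in everyone, dichotomize it into “conditions”, and watch the ANOVA report a “condition effect” that is nothing but a coarsened version of the underlying correlation:

```r
set.seed(7)
n <- 1000
stress <- rnorm(n, mean = 50, sd = 10)     # everyone has a baseline level
sweat  <- 0.3 * stress + rnorm(n, sd = 5)  # sweating tracks stress

# Factorize the IV, as in the example above
group <- factor(stress > median(stress),
                labels = c("control", "high stress"))

summary(aov(sweat ~ group))   # a "significant condition effect"
summary(lm(sweat ~ stress))   # the correlation that actually produced it
```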

But what if we do this within-person? Then we can do an RM ANOVA and we don’t compare different people anymore!
Answer: Well, same story, apart from the fact that now you have made every participant uncomfortable.

You cannot show causality without the condition being truly absent in one group. And by absent, I mean “not even a trace”. This is what makes drug research comparatively easy to operationalize: if you give medicine to one patient and nothing to another, that fulfills the absence rule. But what about stress? Everybody has a baseline level of stress. All you can do is say “well, we have a significant effect of stress on sweating”. But that’s a correlation.

I hope you can see what I’m trying to get at: an ANOVA makes you believe in things that aren’t there – like discrete stress levels or age groups. We as psychologists (or social scientists, for that matter) should know about the biases we all have. And those biases tell us to do the easy thing, always. Even if that means factorizing our ages and stress metrics. Because an ANOVA is so easy to interpret, right? It’s so comfy, and we don’t need to think too much about the whole thing. And I’m not excluding myself here – I’ve done it. But that experience is exactly why I deem it dangerous.

A regression output can be hard to interpret, I know. And compared to an ANOVA, it doesn’t lend itself to sexy writing about the data; an ANOVA makes your writing so much easier and more interesting. But we have to remember one thing: statistics relies on models. All they do is look at the data we give them, each in their own way. It is our job to say “this fits” or “this doesn’t”. And much like with clothing, you can decide whether the result is ugly or not. You can make regressions bloom just like an ANOVA, without sacrificing variance, interpretability, or integrity. Think about your designs. Think about your variables. Think about your models. Interpret your results with care. And don’t use ANOVAs. Your readers and the community will be better off.
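For instance – a tiny sketch with simulated data and hypothetical numbers – instead of reporting coarse group means from an ANOVA, you can report model predictions at meaningful values of the continuous predictor, which reads just as nicely in a results section:

```r
set.seed(7)
stress <- rnorm(200, mean = 50, sd = 10)
sweat  <- 0.3 * stress + rnorm(200, sd = 5)

fit <- lm(sweat ~ stress)

# Predicted sweating at low, average, and high stress, with 95% CIs
predict(fit,
        newdata  = data.frame(stress = c(30, 50, 70)),
        interval = "confidence")
```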


