## Translating SPSS to R: Factorial Between-Subjects ANOVA

Sorry for the lag between posts, but the semester’s been busy and I’ve been sick. Today’s topic is factorial between-subjects ANOVA, but with a particular constraint: the design discussed here will be balanced (or orthogonal) in that the number of subjects in each cell will be the same. The world gets much more complicated when things are not balanced and SPSS and R do not agree in how they handle that, so that situation will get its own post.

I start my students with almost, but not quite, the simplest two-way ANOVA possible, a 2 x 3. (A 2 x 2 doesn’t give much opportunity to do contrasts, which is why I don’t start there.) The data set we start with is this one: detection.xls. There is one dependent variable, “Targets” (number of targets detected by each subject), “Group” with two levels, air traffic controllers (ATCs) and regular college sophomores, and “Clutter” indexing the amount of clutter on the display with three levels: high, medium, and low. I’m not going to go into things like descriptives and box plots because the data set is clean and I’ve already covered that stuff in previous posts.

**Basic ANOVA**

The SPSS code for generating the basic two-way ANOVA and generating interaction plots is pretty simple. First, though, we’ll label the levels of the variables, then run the ANOVA, like this:

`ADD VALUE LABELS Group 0 "ATCs" 1 "Students".`

ADD VALUE LABELS Clutter 0 "High" 1 "Medium" 2 "Low".

`UNIANOVA Targets BY Group Clutter`

/PLOT = PROFILE(Clutter * Group)

/PLOT = PROFILE(Group * Clutter)

/PRINT ETASQ.

We get a nice table of factors, a very complete ANOVA table and two interaction plots that look like these:

For those that dislike partial eta squared as a measure of effect size (I’m not a huge fan myself), SPSS nicely reports all the relevant quantities so that calculation of eta-squared or omega-squared is pretty straightforward. The interaction plots are sort of stupid in that SPSS always zooms in the maximum amount to exaggerate differences, and the color choices aren’t great, but it’s not awful.

So, now R. As usual, there are a whole lot of different ways to get a simple ANOVA out of R. The most straightforward is just to use the “aov()” function for generating an ANOVA. It doesn’t produce interaction plots, but those are pretty easy to get. First we’ll create and label factors (assuming the data set is attached). The code looks like this:

`## Make factors`

ClutterF = factor(Clutter, labels=c("High", "Medium", "Low"))

GroupF = factor(Group, labels=c("ATCs", "Students"))

`## basic ANOVA`

Anv = aov(Targets ~ GroupF*ClutterF)

summary(Anv)

`## interaction plots`

interaction.plot(ClutterF, GroupF, Targets, type="b")

interaction.plot(GroupF, ClutterF, Targets, type="b")

And the output looks like this:

Df Sum Sq Mean Sq F value Pr(>F) GroupF 1 31 30.8 0.371 0.5448 ClutterF 2 4909 2454.5 29.586 2.11e-09 *** GroupF:ClutterF 2 538 269.0 3.243 0.0468 * Residuals 54 4480 83.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Again, there’s annoying whitespace in the tops of the R plots, but otherwise they’re no better or worse than the SPSS plots. R doesn’t automatically give us the partial eta squared values, but the ANOVA table has everything needed on it to compute them. It sure would be nice if the ANOVA table included the total sum of squares, though, as that would save some steps in the computation. On the other hand, because the table produced is a first-class object in R, it is possible to write a function that would compute relevant effect size metrics—but why should I have to do this? Why is this not even an option in R, when most journals (at least in psychology) now encourage or flat-out require reporting of effect size?

So, a slight edge to SPSS on effect size. So, what to do with these results? Well, I always instruct my students that they should deal with the interaction first, before even looking at the main effects. I teach them three ways to deal with interactions, though I don’t actually recommend they use one of them. The three ways are doing posthocs on cells, doing simple main effects, and doing interaction contrasts. I note that the last of those is probably the best approach but they should be aware of the others because they’ll see them in print and they should know how to do them. So let’s look at how to handle each of those in the two packages.

**Cell Posthocs**

Frankly, I hate these and wish people would stop doing them. If the interaction is really keyed by some pair of cells being different, there should be a decent contrast for it. Anyway, here’s how I teach my students to handle it: create a new variable representing the cells, then do a oneway ANOVA with that as the new independent variable, and look for pairwise differences there. Here’s the SPSS version of this:

`COMPUTE Cell = Clutter + 10*Group.`

EXECUTE.

ONEWAY Targets BY Cell

/POSTHOC QREGW.

I use “10” as the multiplier in creating the “Cell” variable because that way the cell numbers are indexed by row, column in the variable values. Here’s the relevant piece of the SPSS output:

In terms of interpreting the interaction, this is pretty close to useless, which is actually what I want them to learn. I like that SPSS produces this in a nice, easy to read table. The equivalent R code requires loading a library to get Ryan’s Q (as mentioned last time) and looks like this:

`library(mutoss)`

clutter$CellF = factor(Clutter + 10*Group)

regwq(Targets ~ CellF, data=clutter, alpha=.05)

And produces this output (truncated for the sake of brevity, if you can believe that):

#----REGWQ - Ryan / Einot and Gabriel / Welsch test procedure Number of hyp.: 15 Number of rej.: 10 rejected pValues adjPValues 1 1 0 0 2 3 0 0 3 5 0.0016 0.0016 4 4 0.0017 0.0017 5 6 0.0019 0.0019 6 8 0.0061 0.0061 7 7 0.0081 0.0081 8 10 0.0087 0.0087 9 12 0.0163 0.0163 10 2 3e-04 3e-04 $adjPValues [1] 0 3e-04 0 0.0017 0.0016 0.0019 0.0081 0.0061 0.3484 0.0087 0.526 0.0163 0.4646 0.5103 0.0184 Levels: 0 0.0016 0.0017 0.0019 0.0061 0.0081 0.0087 0.0163 0.0184 0.3484 0.4646 0.5103 0.526 3e-04`$rejected [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE`

`$statistic [1] 9.8254 6.3882 8.9227 5.4508 5.4855 5.4161 4.4093 4.5481 1.9790 4.3745 0.9027 3.5066 1.0416 0.9374 3.4371`

$confIntervals [,1] [,2] [,3] 0-2 28.3 NA NA 10-2 18.4 NA NA 0-12 25.7 NA NA 11-2 15.7 NA NA 10-12 15.8 NA NA 0-1 15.6 NA NA 1-2 12.7 NA NA 11-12 13.1 NA NA 10-1 5.7 NA NA 0-11 12.6 NA NA 12-2 2.6 NA NA 1-12 10.1 NA NA 11-1 3.0 NA NA 10-11 2.7 NA NA 0-10 9.9 NA NA

And, as previously mentioned, this is a terrifically awful output format. Yes, I get that posthocs are frowned upon in the statistics purist community and I agree that they are often abused and should be discouraged, but as a usability person I’m still stupefied by how needlessly hostile this presentation is.

Again, though, the good news is that perhaps students learn not to try to interpret interactions this way.

**Simple Main Effects**

Many people just call this “simple effects” and that’s fine, my old stats professor in grad school called it “simple main effects” so that’s how it reads back in my head. This procedure is common but is also inherently flawed since it relies on accepting the null of “no effect” in some conditions. As a result, I don’t really like it that much, either, but psychologists aren’t going to stop doing it anytime soon and so students should learn what it is. In this case I’ll just show the version where we split on the “Group” variable and look at whether or not there’s an effect of “Clutter.” SPSS supports this pretty well, actually:

`SORT CASES BY Group.`

SPLIT FILE BY Group.

UNIANOVA Targets BY Clutter.

SPLIT FILE OFF.

The relevant part of the output looks like this:

I do note that they shouldn’t actually directly use these results and should use the error term from the original ANOVA. Some of them even remember to do that. Now, R. As usual, I’m sure there are about a zillion ways to do this in R, probably involving “by()” or some “ply” function, but hey, the first language I learned was BASIC and the second was FORTRAN, so I’m using a for loop, dammit:

`for (Grp in levels(GroupF)) {`

cat("\n------ Group:", Grp, "-----\n")

print(summary(aov(Targets[GroupF == Grp] ~ ClutterF[GroupF == Grp])))

}

I put in the little header there to make it easier to read. Here’s the output:

------ Group: ATCs ----- Df Sum Sq Mean Sq F value Pr(>F) ClutterF[GroupF == Grp] 2 4018 2009.2 28.12 2.5e-07 *** Residuals 27 1929 71.4 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 ------ Group: Students ----- Df Sum Sq Mean Sq F value Pr(>F) ClutterF[GroupF == Grp] 2 1428 714.2 7.56 0.00247 ** Residuals 27 2551 94.5 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Great, very comparable, though I could do without all the “significance codes” garbage. I think the syntax to generate this is a little easier to understand in SPSS for the novice, but the R isn’t *that* bad, and the mere fact that it has a for loop available is a big win in my book.

**Interaction Contrasts**

OK, gentle reader, this is the part of the show that makes me want to find the SPSS programmers and do unspeakable things to them. SPSS’s handling of interaction contrasts is, to put it bluntly, horrific. In a straightforward world, I’d generate a contrast on one factor, generate the contrast on the other factor, and the software would do the cross-multiplication, and everything would be great. But no, SPSS will **not** do the cross multiplication! Here’s the best part: in within-subjects ANOVA with SPSS’s GLM procedure, it **does** do the cross-multiplication! So why, for the love of all that is good in the world, doesn’t it do this for between-subjects factors? Why? WHY?

Oh, sorry, back the the story. So, I teach my students that there are two ways to cajole SPSS into producing an interaction contrast: code the design into individual cells (like with posthocs on cells) and apply the hand-multiplied vector to that, or use the /LMATRIX subcommand supported by SPSS, which unfortunately does not work at all if the design is anything other than a two-way (seriously, I am not making that up). So, let’s say the contrast we want on “Clutter” is (-2 1 1), which is “high” vs. the mean of the other two, and of course the only contrast on “Group” is (1 -1) since it only has two levels. Coding it by cells, the SPSS code looks like this:

`UNIANOVA Targets BY Cell`

/CONTRAST(Cell) = SPECIAL(-2 1 1 2 -1 -1).

Note that this is a total minefield, because the user has to remember exactly how the “Cell” variable was coded and if the order of the coefficients is screwed up, SPSS will happily spit out a contrast—it’ll just be garbage. Always assuring for new students. Anyway, the output:

If they want to use /LMATRIX, then it looks like this:

`UNIANOVA Targets BY Group Clutter`

/LMATRIX "Reverse at high" Group*Clutter

-2 1 1 2 -1 -1.

At least there they get to put in a label. The best part there, though, is that the coefficient order is dictated by the order of the IVs in the “BY” part of the syntax, not the order in the LMATRIX command itself. Awesome. Anyway, the output is basically the same:

Plus an ANOVA table that is identical to the one above. On the plus side, it does at least spit out the contrast value, so that’s something, but it’s pretty weak sauce.This whole process is so mind-numbingly unnecessary I find it hard to believe. I just know that internally SPSS is doing the ANOVA as a multiple regression anyway—why not just replace the internal vectors with the user-supplied ones? Seriously.

Of course, this is exactly what R does. Thank you, R, for a breath of sanity. R knows that contrasts are a property of the independent variables and allows you to set them—hey, how about that! So, you just set it, tell R to label it in the output at the output step, and away you go:

`contrasts(ClutterF) = cbind(c(-2, 1, 1))`

ClutLab = list("Hi v Oth"=1)

AnvL1 = aov(Targets ~ GroupF*ClutterF)

summary(AnvL1, split=list(ClutterF=ClutLab))

And you get this output:

Df Sum Sq Mean Sq F value Pr(>F) GroupF 1 31 31 0.371 0.5448 ClutterF 2 4909 2454 29.586 2.11e-09 *** ClutterF: Hi v Oth 1 3245 3245 39.112 6.66e-08 *** GroupF:ClutterF 2 538 269 3.243 0.0468 * GroupF:ClutterF: Hi v Oth 1 538 538 6.481 0.0138 * Residuals 54 4480 83 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Hey, look, the same *F*-value without having to do the matrix multiply on your own. How about that? Was that really so hard, SPSS? Really?

**Main Effect Contrasts**

If you’re doing an interaction contrast in R, you get the main effect contrast for free—see above. If you want the equivalent main effect contrast in SPSS, that at least is relatively easy:

`UNIANOVA Targets BY Group Clutter`

/CONTRAST(Clutter) = SPECIAL(-2 1 1).

You get the same kind of output as in previous contrasts:

Thankfully, that matches the R output from the above, with that added bonus that you get the actual contrast estimate and standard error, too, if you need them. Note that I’m not advocating this particular analysis in this case, as the interaction is what’s really relevant—this is just how the code works in case the interaction is not significant (in both senses of the term) and a main effect contrast is called for.

**Main Effect Posthocs**

As previously noted, posthocs are not exactly a strength in R. SPSS, on the other hand, makes them easy. To do a Ryan’s Q posthoc on the main effect of “Clutter,” this is the appropriate SPSS code:

`UNIANOVA Targets BY Group Clutter`

/POSTHOC = Clutter(QREGW).

And here’s the output:

Usefully noting that all three marginal means are distinguishable. As noted, R sucks at posthocs. Getting Ryan’s Q for a single factor within a factorial design is less than wonderful, as you have to manually supply the error term d.f. and MSE, because the procedure doesn’t natively understand factorial designs (yes, really):

`regwq(Targets~ClutterF, data=clutter, alpha=.05, df=54, MSE=82.961)`

And here’s the relevant part of the output:

#----REGWQ - Ryan / Einot and Gabriel / Welsch test procedure Number of hyp.: 3 Number of rej.: 3 rejected pValues adjPValues 1 1 0 0 2 2 0 0 3 3 0.0025 0.0025 $adjPValues [1] 0 0 0.0025 Levels: 0 0.0025 $rejected [1] TRUE TRUE TRUE $statistic [1] 10.8265 6.3338 4.4926 $confIntervals [,1] [,2] [,3] High-Low 22.05 NA NA Medium-Low 12.90 NA NA High-Medium 9.15 NA NA

I’ll be the first to agree with the R purists that this is not the best kind of analysis to be doing, but nonetheless, there are some times when it’s all you’ve got, so doing it should be less convoluted.

**Bottom Line**

Most of this is easier and/or more straightforward in SPSS, but with one fatal weakness. The particular beef I have with R is that it doesn’t do a good job of supporting effect size computation because it doesn’t output total sum of squares. (I realize this is not hard to compute, but my point is that I shouldn’t have to.) On the other hand, SPSS’s handling of interaction contrasts is unforgivably awful. Since I firmly believe that’s often the best way to help interpret an interaction, this is a pretty serious failing. I have to give the edge to R here.