Home > Uncategorized > Translating SPSS to R: Mixed Repeated-Measures ANOVA

Translating SPSS to R: Mixed Repeated-Measures ANOVA

As usual, it’s been far too long since I’ve posted, but the fall semester is coming and I’ve been ramping back up on both SPSS and R lately and I’d like to get in a couple more posts to finish off this series. Thus, the return.

This post will cover a simple mixed repeated-measures ANOVA. That is, an ANOVA with both within-subjects and between-subjects factors. I’ll continue to use the Elashoff data set that I used in the last post; the data are in the file elashoff.xls. The data file is just as described in that last post, with 11 variables: subject number, the group number (the between-subjects variables) an then 9 measurements that make up the 3 x 3 within-subjects part of the design.

In the within-subjects part of the design, there were main effects of drug type and dosage, but no interactions. Adding the between-subjects effect actually generates an interaction, which looks like this:

Interaction graph

Basic Mixed ANOVA

Basic ANOVA in SPSS

Again, SPSS makes the easy things easy, and the code to get SPSS to do this is straightforward:


GLM T1D1 to T3D3 BY Group
/WSFACTOR Type 3 Dose 3
/PRINT ETASQ.

This is the same as the entirely within-subjects code, but with the addition of “BY Group” which tells SPSS to include a between-subjects variable. Simple, no?

SPSS produces the same printout that shows the pairings between the individual variables and the levels in the repeated-measures design so I won’t repeat that here. In addition, since there is now a between-subjects effect, it provides a listing of the number of subjects in each between-subjects group, like this:

groupNumbers

The next thing it produces is the sphericity output, which is now different because of the between-subjects factor. That looks like this:

sphericity

And here’s our first really, really serious problem. If you have a between-subjects factor, SPSS’s computation of the Hunh-Feldt epsilon is WRONG. Yep, it’s just plain incorrect. SPSS has known about this bug for decades, but hasn’t fixed it yet. Hard to believe, but true. It’s not way off, but it’s a little bit off. R gets it right, below, so look there to see the correct number. How they get away with this year after year, version after version, is simply beyond me.

Next is the big ANOVA table with all the within-subjects factors. It’s big and ugly, but it’s all there:

WS ANOVA table

Same basic effects for all the purely within-subjects parts, but now we pick up the interaction of “group” and drug type, which was evident in the graph.

Also notice that this time I requested effect size estimates. SPSS only gives partial eta squared, which is probably the worst of them all, but it’s (very slightly) better than nothing. Finally, SPSS provides the tests of between-subjects effects, which in this case isn’t very interesting, but it is necessary for completeness. Here’s that result:

Between ANOVA results

And there you go. SPSS produces a lot of other output, including multivariate tests and all kinds of default contrasts, but that’s not new here.

Basic ANOVA in R, Wide Format

This is, in fact, also a fairly simple modification of the strictly within-subjects code from last time as well. There are two main differences. First, of course, is that R has to be told explicitly that something is a factor. Second, R needs to have a default set so that it computes sums of squares properly for the mixed design. That’s the “options” statement in the code right after the “car” library is loaded.

The code looks like this (assuming as usual that the data are attached so variables are local):


## Create labelled factors for w-s factors
Type = factor(c("T1", "T1", "T1", "T2", "T2", "T2", "T3", "T3", "T3"))
Dose = factor(c("D1", "D2", "D3", "D1", "D2", "D3", "D1", "D2", "D3"))

## shorter, but equivalent, way to do this
Type = factor(c(rep("T1", 3), rep("T2", 3), rep("T3",3)))
Dose = factor(rep(c("D1", "D2", "D3"), 3))

## load "car" library and set options
library(car)
options(contrasts = c("contr.sum","contr.poly"))

## Add the between-subects factor
GroupF = factor(Group, labels=c("No Sleep", "1hr Sleep"))

## run the ANOVA and print a summary
Anv3 = lm(cbind(T1D1, T1D2, T1D3, T2D1, T2D2, T2D3, T3D1, T3D2, T3D3)~GroupF)
Anv4 = Anova(Anv3, idata=data.frame(Type, Dose), idesign=~Type*Dose, type=3)
summary(Anv4, multivariate=F)

And again, even for this relatively simple three-way design, there’s a fair amount of output:

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                    SS num Df Error SS den Df         F    Pr(>F)    
(Intercept)      79900      1   630.11     14 1775.2523 3.774e-16 ***
GroupF             187      1   630.11     14    4.1499 0.0610036 .  
Type               426      2   599.64     28    9.9444 0.0005457 ***
GroupF:Type        410      2   599.64     28    9.5669 0.0006816 ***
Dose               878      2   549.89     28   22.3635 1.572e-06 ***
GroupF:Dose        102      2   549.89     28    2.6068 0.0915784 .  
Type:Dose           46      4   554.11     56    1.1556 0.3402490    
GroupF:Type:Dose    27      4   554.11     56    0.6776 0.6103418    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Mauchly Tests for Sphericity

                 Test statistic  p-value
Type                    0.25246 0.000130
GroupF:Type             0.25246 0.000130
Dose                    0.75866 0.166077
GroupF:Dose             0.75866 0.166077
Type:Dose               0.17106 0.009729
GroupF:Type:Dose        0.17106 0.009729


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

                  GG eps Pr(>F[GG])    
Type             0.57223   0.004847 ** 
GroupF:Type      0.57223   0.005548 ** 
Dose             0.80558  1.232e-05 ***
GroupF:Dose      0.80558   0.105049    
Type:Dose        0.52518   0.330823    
GroupF:Type:Dose 0.52518   0.522531    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                    HF eps   Pr(>F[HF])
Type             0.5898996 4.424443e-03
GroupF:Type      0.5898996 5.082647e-03
Dose             0.8946538 4.788591e-06
GroupF:Dose      0.8946538 9.867570e-02
Type:Dose        0.6200127 3.347825e-01
GroupF:Type:Dose 0.6200127 5.448314e-01

And yes, these are the correct HF epsilons. What SPSS is doing here I don’t know. So, other than that, you get the same thing out of R that you get out of SPSS, if in a somewhat less appealing format and with more complex code.

Basic ANOVA in R, Long Format

Again, this requires conversion of the wide-format data into long format. This is accomplished with the same code as in the entirely within-subjects example, and simply requires the addition of the options set and coding of “group” into a factor as well. It looks like this:


## melt into long form
library(reshape2)
stim.lng = melt(stim, id=c("Snum", "Group"), variable.name = "Cond", value.name="Alrtns")

## code separate factors for Type and Dose
stim.lng$TypeF = factor(substr(stim.lng$Cond, 1, 2))
stim.lng$DoseF = factor(substr(stim.lng$Cond, 3, 4))

## Make Snum into a factor
stim.lng$Snum = factor(stim.lng$Snum)

## make Group into a factor
stim.lng$GroupF = factor(stim.lng$Group, labels=c(“No Sleep”, “1hr Sleep”))

## load library and set options
library(ez)
options(contrasts = c(“contr.sum”,”contr.poly”))

## run the ANOVA
ezANOVA(stim.lng, dv=Alrtns, wid=Snum, within=.(TypeF, DoseF), between=GroupF, detailed=T)

The ezANOVA package is nice, to be sure, looking almost like SPSS. Here’s the output:

$ANOVA
              Effect DFn DFd         SSn      SSd            F            p p<.05        ges
1        (Intercept)   1  14 79900.44444 630.1111 1775.2523364 3.774108e-16     * 0.97162069
2             GroupF   1  14   186.77778 630.1111    4.1498854 6.100363e-02       0.07410265
3              TypeF   2  28   425.93056 599.6389    9.9443647 5.456901e-04     * 0.15434053
5              DoseF   2  28   878.38889 549.8889   22.3635078 1.572069e-06     * 0.27345919
4       GroupF:TypeF   2  28   409.76389 599.6389    9.5669153 6.816385e-04     * 0.14935732
6       GroupF:DoseF   2  28   102.38889 549.8889    2.6067893 9.157840e-02       0.04202917
7        TypeF:DoseF   4  56    45.73611 554.1111    1.1555544 3.402490e-01       0.01922100
8 GroupF:TypeF:DoseF   4  56    26.81944 554.1111    0.6776118 6.103418e-01       0.01136143

$`Mauchly's Test for Sphericity`
              Effect         W            p p<.05
3              TypeF 0.2524563 0.0001300799     *
4       GroupF:TypeF 0.2524563 0.0001300799     *
5              DoseF 0.7586605 0.1660766393      
6       GroupF:DoseF 0.7586605 0.1660766393      
7        TypeF:DoseF 0.1710604 0.0097290347     *
8 GroupF:TypeF:DoseF 0.1710604 0.0097290347     *

$`Sphericity Corrections`
              Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
3              TypeF 0.5722318 4.846844e-03         * 0.5898996 4.424443e-03         *
4       GroupF:TypeF 0.5722318 5.547608e-03         * 0.5898996 5.082647e-03         *
5              DoseF 0.8055814 1.231551e-05         * 0.8946538 4.788591e-06         *
6       GroupF:DoseF 0.8055814 1.050485e-01           0.8946538 9.867570e-02          
7        TypeF:DoseF 0.5251803 3.308227e-01           0.6200127 3.347825e-01          
8 GroupF:TypeF:DoseF 0.5251803 5.225310e-01           0.6200127 5.448314e-01          

I also like this better than the car::ANOVA raw output. The problem with long form is contrasts, as was true with strictly repeated-measures designs.

Simple Main Effects

I once again have no desire to get into arguments about terminology for or appropriateness of such analyses. People do them and will continue to do them and I’m just going to leave it at that. The good news is that this time, there actually is a potentially meaningful interaction in the data, and it crosses a between-subjects factor with a within-subjects factor, so there is some possible work to be done in generating the code to handle it.

From visual inspection, what appears to be happening is no effect of drug type in the “no sleep” group with an effect apparent in the “1hr sleep” group. So we’ll do two repeated-measures ANOVAs, one for each group, and looking for the effect of drug type in both cases.

Simple Main Effects in SPSS

This uses the same SORT/SPLIT idiom that we saw with between-subjects ANOVA. The code looks like this:


SORT CASES BY Group.
SPLIT FILE BY Group.
GLM T1D1 to T3D3
/WSFACTOR Type 3 Dose 3.
SPLIT FILE OFF.

And here are the relevant parts of the output. First, sphericity stuff:

SME sphericity

And then the actual ANOVA table. Yes, this is getting pretty long:

SME ANOVA table

This shows what we’d expect here, an effect of drug type for the 1hr Sleep group and no effect for the No Sleep group. (Apply all the usual horrible caveats that apply to this type of inference logic.)

Simple Main Effects in R, Wide Format

Just as we used the same idiom we used for between-subjects in SPSS, we will use the same idiom that we used in between-subjects ANOVA in R, which is a for() loop. I know people like to try to vectorize everything in R, but if you ever have enough groups that this really matters for performance reasons, there are other issues. So, the R code looks like this:


for (Grp in levels(stim$GroupF)) {
cat("\n------ Group:", Grp, "-----\n")
SubFrame = stim[GroupF == Grp,]
Anv1 = lm(cbind(T1D1, T1D2, T1D3, T2D1, T2D2, T2D3, T3D1, T3D2, T3D3)~1, data=SubFrame)
Anv2 = Anova(Anv1, idata=data.frame(Type, Dose), idesign=~Type*Dose, data=SubFrame, type=3)
print(summary(Anv2, multivariate=F))
}

That is, iterate through the groups, select the subset of data for that group, then run the repeated-measures ANOVA for each group. It’s not super-clean and there’s probably some better way to do this with some kind of by() statement, but this works well enough for me. Here’s the output:

------ Group: No Sleep -----

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

               SS num Df Error SS den Df         F    Pr(>F)    
(Intercept) 36180      1   127.06      7 1993.3288 7.388e-10 ***
Type            0      2   331.53     14    0.0053 0.9947373    
Dose          784      2   345.03     14   15.9077 0.0002488 ***
Type:Dose      19      4   364.89     28    0.3581 0.8361991    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Mauchly Tests for Sphericity

          Test statistic  p-value
Type             0.22647 0.011616
Dose             0.51826 0.139202
Type:Dose        0.01185 0.006061


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

           GG eps Pr(>F[GG])   
Type      0.56385    0.95905   
Dose      0.67488    0.00179 **
Type:Dose 0.37653    0.64898   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

             HF eps   Pr(>F[HF])
Type      0.5978545 0.9652534118
Dose      0.7785622 0.0009500507
Type:Dose 0.4572805 0.6875468299

------ Group: 1hr Sleep -----

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

               SS num Df Error SS den Df        F    Pr(>F)    
(Intercept) 43907      1   503.06      7 610.9605 4.522e-08 ***
Type          835      2   268.11     14  21.8123 4.996e-05 ***
Dose          197      2   204.86     14   6.7209  0.008995 ** 
Type:Dose      54      4   189.22     28   1.9935  0.122878    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Mauchly Tests for Sphericity

          Test statistic p-value
Type             0.14145 0.00283
Dose             0.86525 0.64777
Type:Dose        0.38241 0.82632


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

           GG eps Pr(>F[GG])   
Type      0.53805   0.001702 **
Dose      0.88125   0.012432 * 
Type:Dose 0.77327   0.143502   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

             HF eps  Pr(>F[HF])
Type      0.5578137 0.001460198
Dose      1.1551317 0.008994918
Type:Dose 1.4554011 0.122878219

And again, it all agrees with the SPSS, which is what we’re looking for.

Simple Main Effects in R, Long Format

Again, same idea as between-subjects, using a for() loop, assuming the data now in long format in the “stim.lng” data frame. Code:


for (Grp in levels(stim.lng$GroupF)) {
cat("\n------ Group:", Grp, "-----\n")
print(ezANOVA(stim.lng[stim.lng$GroupF==Grp,], dv=Alrtns, wid=Snum, within=.(TypeF, DoseF), detailed=T))
}

And output:

------ Group: No Sleep -----
Warning: You have removed one or more Ss from the analysis. Refactoring "Snum" for ANOVA.
$ANOVA
       Effect DFn DFd         SSn      SSd            F            p p<.05          ges
1 (Intercept)   1   7 36180.50000 127.0556 1.993329e+03 7.387751e-10     * 0.9687140218
2       TypeF   2  14     0.25000 331.5278 5.278592e-03 9.947373e-01       0.0002139037
3       DoseF   2  14   784.08333 345.0278 1.590766e+01 2.487835e-04     * 0.4015620332
4 TypeF:DoseF   4  28    18.66667 364.8889 3.580999e-01 8.361991e-01       0.0157237119

$`Mauchly's Test for Sphericity`
       Effect         W           p p<.05
2       TypeF 0.2264721 0.011615672     *
3       DoseF 0.5182605 0.139201627      
4 TypeF:DoseF 0.0118548 0.006060616     *

$`Sphericity Corrections`
       Effect       GGe       p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
2       TypeF 0.5638479 0.959053605           0.5978545 0.9652534118          
3       DoseF 0.6748825 0.001789689         * 0.7785622 0.0009500507         *
4 TypeF:DoseF 0.3765304 0.648979117           0.4572805 0.6875468299          


------ Group: 1hr Sleep -----
Warning: You have removed one or more Ss from the analysis. Refactoring "Snum" for ANOVA.
$ANOVA
       Effect DFn DFd         SSn      SSd          F            p p<.05        ges
1 (Intercept)   1   7 43906.72222 503.0556 610.960464 4.522047e-08     * 0.97414690
2       TypeF   2  14   835.44444 268.1111  21.812267 4.996251e-05     * 0.41757723
3       DoseF   2  14   196.69444 204.8611   6.720949 8.994918e-03     * 0.14442178
4 TypeF:DoseF   4  28    53.88889 189.2222   1.993541 1.228782e-01       0.04420242

$`Mauchly's Test for Sphericity`
       Effect         W          p p<.05
2       TypeF 0.1414489 0.00283008     *
3       DoseF 0.8652485 0.64777264      
4 TypeF:DoseF 0.3824079 0.82631913      

$`Sphericity Corrections`
       Effect       GGe       p[GG] p[GG]<.05       HFe       p[HF] p[HF]<.05
2       TypeF 0.5380535 0.001701661         * 0.5578137 0.001460198         *
3       DoseF 0.8812502 0.012431675         * 1.1551317 0.008994918         *
4 TypeF:DoseF 0.7732682 0.143501506           1.4554011 0.122878219          

Yes, there are warning messages, but they’re not important here. Again, agrees with the previous two results, so that’s good. (Note that it is possible in R to change the number of decimals output, but then it messes up other procedures that make assumptions. This is just pure stupid. R people seem to not understand basic UI principles like “the default settings are the UI 90% of the time.”) Anyway, this works as well, with somewhat more compact if over-decimaled output.

Interaction Contrasts

As noted in previous posts, I think this is a dramatically better way of handling the interpretation of interactions. In this particular case, the relevant interaction contrast is pretty clear, since there are only two levels of Group, meaning there’s only one contrast there, (1 -1). The contrast on drug type is (1 -2 1), since it appears that the second type is the one that’s different in the 1hr Sleep condition. So we’ll build that in both packages.

Interaciton Contrats in SPSS

As far as I can tell, the best way to handle interactions that involve both within- and between-subjects factors is to compute new variables on the between-subjects contrasts, and then run those through the appropriate between-subjects ANOVA with the appropriate contrasts there. That’s trivial in this case because there are only two levels of the between-subjects variable. The code is thus pretty easy:


COMPUTE T1 = MEAN(t1d1, t1d2, t1d3).
COMPUTE T2 = MEAN(t2d1, t2d2, t2d3).
COMPUTE T3 = MEAN(t3d1, t3d2, t3d3).
COMPUTE T2vs13 = T1 - 2*T2 +T3.
EXECUTE.
ONEWAY T2vs13 BY Group.

And the output is straightforward:

Interaction Contrast Output

We reject the null here and conclude the contrast characterizes the interaction successfully.

Interaction Contrasts in R, Wide Format

It turns out that in wide format, the way to do this in R looks a lot like SPSS. The main difference is that there are ways to shortcut this in R with matrix operations. I’ll show the most SPSS-like way of doing it and the matrix version to better satsify the R purists (and myself, as this is much cleaner). Here’s the SPSS-like code:


T1 = (T1D1 + T1D2 + T1D3)/3
T2 = (T2D1 + T2D2 + T2D3)/3
T3 = (T3D1 + T3D2 + T3D3)/3
T2vs13 = T1 - 2*T2 +T3
summary(aov(T2vs13 ~ GroupF))

And here’s the vector-multiplicaiton way:


LVec = rep(c(1, -2, 1), 3)
T2vs13 = cbind(T1D1, T2D1, T3D1, T1D2, T2D2, T3D2, T1D3, T2D3, T3D3) %*% LVec
summary(aov(T2vs13 ~ GroupF))

They both produce the same output:

            Df Sum Sq Mean Sq F value Pr(>F)   
GroupF       1   7353    7353   10.35 0.0062 **
Residuals   14   9943     710                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

And, hooray, it matches SPSS. Excellent.

Interaction Contrasts in R, Long Format

This simply does not work in R, as far as I can tell. This is exactly the same problem that I run into with the pure repeated-measures version, and I still have no idea what the problem is or why it occurs. The most frustrating part is that R doesn’t produce any error messages or anything, it just fails to put things in the output. It’s bizarre, annoying, and makes me mistrust R, which is not where you want to be with your statistics package.

Advertisements
Categories: Uncategorized Tags: , ,
  1. neuropsychdiary
    2016.12.06 at 11:00

    A useful post, thanks. I have the same problem with SPSS and R giving different H-F epsilons. Do you have any explanation for this? Which do you think is correct?

    Thanks,

    George

    • sunbyrne
      2016.12.06 at 13:27

      This is a known bug in SPSS; the value R provides is correct.

      -Mike

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: