Home > Uncategorized > Translating SPSS to R: Factorial Repeated Measures

Translating SPSS to R: Factorial Repeated Measures


As I noted in the last blog post, R is not terribly well-situated for repeated measures. This becomes increasingly true as the designs become more complicated. This post will cover a simple two-way repeated-measures ANOVA, so it won’t be all that bad, but even here there will start to be a few complications in places.

The data, which can be found in wide format in elashoff.xls, is a “famous” data set that has appeared in numerous textbooks from Elashoff (1981) “Data for the panel session in software for repeated measures analysis of variance.” It is actually a mixed design (there’s a between-subjects variable), but I’m going to ignore the between-subjects for now in the name of pedagogy. The within-subjects part of the study is a 3 x 3 design with three different types of drugs given at three different dosages. My vague recollection is that the context is that these are stimulants and the dependent measure is some measure of alertness.

The data file has 11 variables in it: the subject number, the group number (for the between-subjects group, which we’re going to ignore for now), and then the 9 observations on each subject. These are noted “TxDx” for “type x” and “dose x” respectively, so “T2D3” is drug type 2 at dosage level 3.

The results look like this:


(No, I didn’t use either SPSS or R to make the graph. I hate both of them for graphing for totally different reasons. That graph was made with DeltaGraph.)

Basic Factorial ANOVA

SPSS has the virtue of making easy things easy. (This is accompanied by sometimes making things of moderate difficulty hard to impossible and hard things unthinkable, but I digress.) The SPSS code to do a simple two-way repeated-measures ANOVA on these data is simply this:

/WSFACTOR Type 3, Dose 3.

There are two features of import here. First, SPSS offers the convenience of not listing out all the variables used, allowing the use of “TO” to indicate a list from one variable to another. Second, the WSFACTOR subcommand that creates the within-subjects independent variables. These names are order-dependent, and listing them in the wrong order will cause SPSS to analyze the data incorrectly, but without issuing any kind of error message. The good news is SPSS provides some output to give the user a fighting chance to catch if this was done wrong:


This table shows how each level of each independent variable is mapped to each variable in the data file. Very handy. Next in the output is a bunch of multivariate tests, which I always ignore, and then the sphericity diagnostics, as per the one-way output, but of course now there are more of them:


The data are pretty non-spherical, particularly things on the “Type” factor. Looks like GG adjustment is appropriate on Type and the interaction, and HF on the Dose effect. Those effects are all shown in the next (fairly large) table:


Looks like both main effects are reliable and the interaction is not. Pretty unsurprising given the graph. As noted last time, SPSS also produces output for contrasts, defaulting to assuming that all factors are amenable to trend analysis. For pedagogical reasons, I rather wish SPSS wouldn’t do this, as students frequently look at this and try to report something from it even when it’s completely irrelevant. (In this particular case it might be relevant for “Dose” but is clearly meaningless for “Type” since that’s obviously categorical.) Anyway, here’s what that looks like:


Whatever. Not my favorite, but not too damaging. Now, let’s see this in R, in both wide and long format.

Basic ANOVA in R, Wide Format
Not surprisingly, this is more complicated, though it’s not really all that bad for a two-way design. The basic issue in generating the code is that that SPSS generates the factorial design for you, and with R you kind of have to build that yourself (though I suspect there’s an easier way to do this, I haven’t found it yet). Here’s the code:

## Create a model of the within-subjects factors
Anv1 = lm(cbind(T1D1, T1D2, T1D3, T2D1, T2D2, T2D3, T3D1, T3D2, T3D3)~1)

## Create labelled factors for w-s factors
Type = factor(c(“T1”, “T1”, “T1”, “T2”, “T2”, “T2”, “T3”, “T3”, “T3”))
Drug = factor(c(“D1”, “D2”, “D3”, “D1”, “D2”, “D3”, “D1”, “D2”, “D3”))

## shorter, but equivalent, way to do this
Type = factor(c(rep(“T1”, 3), rep(“T2”, 3), rep(“T3”,3)))
Dose = factor(rep(c(“D1”, “D2”, “D3”), 3))

## do the ANOVA. “idata” and “idesign” tell Anova() how to
## structure within-subjects factors
Anv2 = Anova(Anv1, idata=data.frame(Type, Dose), idesign=~Type*Dose, type=3)
summary(Anv2, multivariate=F)

This produces output that isn’t quite as clean as SPSS, but does match it in all the important ways:

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

               SS num Df Error SS den Df         F    Pr(>F)    
(Intercept) 79900      1   816.89     15 1467.1600 2.226e-16 ***
Type          426      2  1009.40     30    6.3294   0.00509 ** 
Dose          878      2   652.28     30   20.1997 2.775e-06 ***
Type:Dose      46      4   580.93     60    1.1809   0.32836    
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Mauchly Tests for Sphericity

          Test statistic  p-value
Type             0.17795 0.000006
Dose             0.69506 0.078368
Type:Dose        0.19903 0.010585

Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

           GG eps Pr(>F[GG])    
Type      0.54883    0.02041 *  
Dose      0.76632  2.888e-05 ***
Type:Dose 0.54621    0.32290    
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

             HF eps   Pr(>F[HF])
Type      0.5597129 1.973144e-02
Dose      0.8361743 1.430901e-05
Type:Dose 0.6429432 3.258431e-01

The thing I really miss is the diagnostic about levels—if you mess this up, you get the wrong output and no way of knowing this happened. However, not a surprise that SPSS is more user-friendly than R on pretty much any score (except price, of course, and speed).

Basic ANOVA in R, Long Format
Step 1 is to turn the wide-format data into long-format data. This is not terribly hard in general, but there are some complications. In particular, for multi-factor one-way designs, the variable that holds the condition holds the full condition name, and that has to be broken down into separate variables. I usually do this by just extracting substrings, but this only works if you’re disciplined in your naming conventions. I am, but not everyone is, so when I get data from other people, it’s almost always extra work, but it’s not generally too bad. Anyway, here’s the R code:

## melt into long form
stim.lng = melt(stim, id=c("Snum", "Group"), variable.name = "Cond", value.name="Alrtns")

## code separate factors for Type and Dose
stim.lng$TypeF = factor(substr(stim.lng$Cond, 1, 2))
stim.lng$DoseF = factor(substr(stim.lng$Cond, 3, 4))

## Make Snum into a factor
stim.lng$Snum = factor(stim.lng$Snum)

## run ANOVA
ezANOVA(stim.lng, dv = Alrtns, wid = Snum, within = .(TypeF, DoseF), detailed=T)

And here’s the output:

       Effect DFn DFd         SSn       SSd           F            p p>.05        ges
1 (Intercept)   1  15 79900.44444  816.8889 1467.159956 2.226286e-16     * 0.96312076
2       TypeF   2  30   425.93056 1009.4028    6.329444 5.089509e-03     * 0.12220314
3       DoseF   2  30   878.38889  652.2778   20.199727 2.775105e-06     * 0.22306086
4 TypeF:DoseF   4  60    45.73611  580.9306    1.180936 3.283603e-01       0.01472871

$`Mauchly's Test for Sphericity`
       Effect         W            p p>.05
2       TypeF 0.1779497 5.650424e-06     *
3       DoseF 0.6950563 7.836821e-02      
4 TypeF:DoseF 0.1990287 1.058467e-02     *

$`Sphericity Corrections`
       Effect       GGe        p[GG] p[GG]>.05       HFe        p[HF] p[HF]>.05
2       TypeF 0.5488323 2.040935e-02         * 0.5597129 1.973144e-02         *
3       DoseF 0.7663166 2.887647e-05         * 0.8361743 1.430901e-05         *
4 TypeF:DoseF 0.5462132 3.229028e-01           0.6429432 3.258431e-01          

Because in long form there are no order issues, the problem of mis-ordering the independent variables doesn’t come up, so no worries on that front. I still wish it did the d.f. adjustment for sphericity, but I’ll live.

Simple Main Effects

First, yes, I understand that’s not the terminology of choice for everyone. I discussed that in my post on factorial between-subjects ANOVA. I’m sticking with it. I also get that this isn’t always a great way of decomposing an interaction (In fact, it’s often seriously flawed), but I’m still going to illustrate it for completeness. I am not, however, going to do cell posthocs because they’re awful.

Second, I further understand that that is not really the optimal data set for illustrating this, because there isn’t really any evidence of an interaction. An oversight on my part, perhaps, but it’s a pretty well-known data set—let’s just go with these are the procedures that would be used were there a meaningful interaction, and all agree that they aren’t really appropriate for these data and give that a pass for pedagogical purposes. OK?

Simple Main Effects in SPSS
Unfortunately, the SPLIT… SORT construct that makes this easy to do in between-subjects ANOVA doesn’t work for repeated-measures data in SPSS. It’s not that bad, but it’s sort of tedious. Let’s say we wanted to look for the effects of Dose at the different levels of Type of drug. That would look like this:

GLM T1D1 T1D2 T1D3

GLM T2D1 T2D2 T2D3

GLM T3D1 T3D2 T3D3

In each individual ANOVA, only a subset of the nine measures is used, and at each one, we test for an effect of Dose. Since there’s no interaction and a substantial main effect of Dose, we expect to reject the null in all three cases, and that’s what we get (in three pieces). Here’s the output for drug Type 1:


And for drug Type 2:


And for drug Type 3:


Obviously, if we were decomposing an interaction we’d be looking for some of these ANOVAs telling us to reject and others not.

Simple Main Effects in R, Wide Format
The basics for doing this in R are exactly the same as they are for doing this in SPSS—we do three separate ANOVAs by including only some of the variables in each one.

Dose = factor(c("D1", "D2", "D3"))

T1Anv = lm(cbind(T1D1, T1D2, T1D3)~1)
summary(Anova(T1Anv, idata=data.frame(Dose), idesign=~Dose, type=3), multivariate=F)

T2Anv = lm(cbind(T2D1, T2D2, T2D3)~1)
summary(Anova(T2Anv, idata=data.frame(Dose), idesign=~Dose, type=3), multivariate=F)

T3Anv = lm(cbind(T3D1, T3D2, T3D3)~1)
summary(Anova(T3Anv, idata=data.frame(Dose), idesign=~Dose, type=3), multivariate=F)

Same idea as SPSS, but wordier. Here’s the output in one big chunk:

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                 SS num Df Error SS den Df       F    Pr(>F)    
(Intercept) 23585.3      1   144.67     15 2445.48 < 2.2e-16 ***
Dose          386.8      2   277.21     30   20.93  2.04e-06 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Mauchly Tests for Sphericity

     Test statistic    p-value
Dose        0.28979 0.00017161

Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

      GG eps Pr(>F[GG])    
Dose 0.58472  0.0001503 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

        HF eps   Pr(>F[HF])
Dose 0.6041376 0.0001227022

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

               SS num Df Error SS den Df      F    Pr(>F)    
(Intercept) 32396      1  1360.31     15 357.23 7.147e-12 ***
Dose          384      2   318.62     30  18.08 7.046e-06 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Mauchly Tests for Sphericity

     Test statistic p-value
Dose        0.91258 0.52712

Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

      GG eps Pr(>F[GG])    
Dose 0.91961  1.466e-05 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

       HF eps  Pr(>F[HF])
Dose 1.042018 7.04624e-06

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                 SS num Df Error SS den Df         F   Pr(>F)    
(Intercept) 24345.0      1   321.31     15 1136.5114 1.48e-15 ***
Dose          153.3      2   637.38     30    3.6076  0.03945 *  
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Mauchly Tests for Sphericity

     Test statistic p-value
Dose        0.75338 0.13775

Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

      GG eps Pr(>F[GG])  
Dose 0.80217    0.05146 .
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

        HF eps Pr(>F[HF])
Dose 0.8834696  0.0461275

Same answers as SPSS, of course, which is good. Now, for long-format, which is actually both harder and easier, depending on your background.

Simple Main Effects in R, Long Format
So, assuming the data are in a frame called “stim.lng” as in the example above, the way I think of doing this is in a for() loop. Essentially, this is exactly the same as the between-subjects version of this, which is great if you’re already comfortable with the idea of a for() loop, which not all psychology students are. Be that as it may, here’s a very simple loop which not only gets the job done but makes nice labels:

for (Type in levels(stim.lng$TypeF)) {
cat("\n------ Drug Type:", Type, "-----\n")
print(ezANOVA(stim.lng[stim.lng$TypeF==Type,], dv=Alrtns, wid=Snum, within=.(DoseF), detailed=T))

And here’s the output:

------ Drug Type: T1 -----
       Effect DFn DFd        SSn      SSd          F            p p<.05       ges
1 (Intercept)   1  15 23585.3333 144.6667 2445.48387 4.964361e-18     * 0.9824272
2       DoseF   2  30   386.7917 277.2083   20.92966 2.039732e-06     * 0.4783079

$`Mauchly's Test for Sphericity`
  Effect         W            p p<.05
2  DoseF 0.2897867 0.0001716125     *

$`Sphericity Corrections`
  Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
2  DoseF 0.5847224 0.0001502653         * 0.6041376 0.0001227022         *

------ Drug Type: T2 -----
       Effect DFn DFd        SSn      SSd         F            p p<.05       ges
1 (Intercept)   1  15 32396.0208 1360.312 357.22697 7.147465e-12     * 0.9507281
2       DoseF   2  30   384.0417  318.625  18.07964 7.046240e-06     * 0.1861588

$`Mauchly's Test for Sphericity`
  Effect         W         p p<.05
2  DoseF 0.9125831 0.5271171      

$`Sphericity Corrections`
  Effect       GGe        p[GG] p[GG]<.05      HFe       p[HF] p[HF]<.05
2  DoseF 0.9196105 1.465652e-05         * 1.042018 7.04624e-06         *

------ Drug Type: T3 -----
       Effect DFn DFd        SSn      SSd          F            p p<.05       ges
1 (Intercept)   1  15 24345.0208 321.3125 1136.51138 1.480283e-15     * 0.9621128
2       DoseF   2  30   153.2917 637.3750    3.60757 3.944790e-02     * 0.1378548

$`Mauchly's Test for Sphericity`
  Effect        W         p p<.05
2  DoseF 0.753376 0.1377471      

$`Sphericity Corrections`
  Effect       GGe      p[GG] p[GG]<.05       HFe     p[HF] p[HF]<.05
2  DoseF 0.8021665 0.05145518           0.8834696 0.0461275         *

Note the identical F-values to the other two ways of doing this. However, this seems much easier to me than writing out three separate ones and scales better to larger designs. Of course, simple main effects aren’t my preferred way of dealing with this problem in the first place, and that’s where things start go get dicey.

Interaction Contrasts

Again, no, there’s not an interaction here. However, for pedagogical reasons, I’m going to go through interaction contrasts anyway, because this is where getting SPSS and R to produce the same answer is a little more complicated. Actually, with interaction contrasts, getting either package to produce an answer is not as straightforward as it might be, but both support it.

So what interaction contrast do we want? Since there isn’t really an interaction, it doesn’t matter much. Let’s say the experiment had come out differently, and not only was drug Type 2 better, but that interacted with Dose—let’s say the dose-response function was much steeper for drug Type 2. We’d want a linear contrast on dosage, which is (1 0 -1) on Dose, and we want to compare drug Type 2 with the average of the other two groups, which is (1 -2 1) on Dose. We want the outer product of those across the nine means—an interaction contrast. So, how do we make that happen in the two packages?

Interaction Contrasts in SPSS
In between-subjects ANOVA, SPSS’s support for interaction contrasts is utterly awful, as I mentioned in the blog post where I covered it. The good news is that it’s better for within-subjects ANOVA. It’s still not wonderful, but at least SPSS will do the cross-multiplication for you here (which it won’t for between-subjects ANOVA—don’t get me started). For a within-subjects ANOVA, we have to tell SPSS what contrasts we want, and if we only want one contrast on each, the easiest way to keep everything straight is to just repeat those contrasts in each of the statements. For the contrast we want, the syntax looks like this:

1 -2 1
1 -2 1)
Dose 3 SPECIAL(1 1 1
1 0 -1
1 0 -1).

Kind of ugly. Note that repeated the contrasts twice isn’t necessary in the sense that if you wanted to put in other contrasts besides just the one I specified above, you could. You cannot, however, just put in the one you want and be done—SPSS requires that the matrix you specify on each factor be k by k, where k is the number of levels of the variable, and the first entry in the matrix must be all 1s. Why this is required is unclear. Anyway, here’s the relevant part of the output:


And, as expected, the test fails with a p-value of .455. Not a surprise since there isn’t really an interaction—but this is how you might break one down with contrasts.

Note that just as with one-way repeated-measure ANOVA, this assumes that each contrast gets its own error term, and thus we need not assume sphericity here, but we do lose some degrees of freedom in the error term. If this is a problem, then it’s possible to use the MSE from the original ANOVA and adjust the sum of squares for the contrast by dividing it by the squared vector length to produce an appropriate F. This is annoying to do, and it would be nice if SPSS gave the option to use the pooled error term for those who want it.

Interaction Contrasts in R, Wide Format
As noted in my previous post on one-way repeated measures ANOVA, R has good tactilities for vector (and matrix) multiplication, and that’s probably the easiest way to do contrasts in wide-format data: multiply the data by the contrast you want, and do a t-test on the resulting variable. The same strategy applies here, only now you need to take the outer product of the vectors on the two different independent variables and multiply your data matrix by the result of that. It sounds cumbersome but it’s really not that bad.

Remember that we have (1 0 -1) on Dose and (1 -2 1) on Type. We’ll take the outer product of those and just multiply by an array created by putting together all the relevant variables:

## build the contrast vector. Contrast on Dose goes first, because
## Dose changes fastest (opposite order of SPSS!)
LVec = c(c(1, 0, -1) %o% c(1, -2, 1))

## Multiply
L = cbind(T1D1, T1D2, T1D3, T2D1, T2D2, T2D3, T3D1, T3D2, T3D3) %*% LVec

## Do the test.

And this is what you get:

One Sample t-test

data: L
t = 0.7663, df = 15, p-value = 0.4554
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-4.11939 8.74439
sample estimates:
mean of x

Remember that SPSS produced F(1, 15) = 0.59, p = .455, so this does indeed produce the same result as SPSS, and rests on the same set of assumptions (no sphericity required, but no pooling of error d.f.).

Interaction Contrasts in R, Long Format
As far as I can tell, if you do it this way, you cannot get the same results out that you get from SPSS. There may well be some trick here that I don’t know, but so far I have been unable to get R to do this without using the pooled error term, and therefore requiring the assumption of sphericity. Whether that’s a good or a bad thing is not something I’m going to debate here, I just want to be up-front about what’s going on.

Furthermore, for reasons that I have not yet been able to determine, this method sometimes (frequently) completely fails, and R provides no information about why. I’ll admit that I’m a little flummoxed here and if there are any R folks out there who can help, I would most definitely appreciate the assistance.

The basic idea is to set up contrasts on the factors, run the ANOVA, and then ask for the summary to split the output according to those contrasts. The ezANOVA() procedure will kindly return the aov object in order to facilitate this, and then you can call summary() on that. It looks like this:

## set up the factors with the contrasts we want, and label them
contrasts(stim.lng$TypeF) = cbind(c(1, -2, 1))
contrasts(stim.lng$DoseF) = contr.poly(3)
TypeLb = list("2vs1&3"=1)
DoseLb = list("Linear"=1, "Quad"=2)

## run ANOVA with ez, but save out the result
ezANOVA(stim.lng, dv = Alrtns, wid = Snum, within = .(TypeF, DoseF), detailed=T)
AnvE = ezANOVA(stim.lng, dv = Alrtns, wid = Snum, within = .(TypeF, DoseF), return_aov=T)$aov
summary(AnvE, split=list(TypeF=TypeLb, DoseF=DoseLb))

This gives us results that actually match SPSS in terms of sums of squares for the main effect contrasts, but then, for no reason this is provided in the output, the interaction contrast simply do not appear at all:

Error: Snum
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 15  816.9   54.46               

Error: Snum:TypeF
                Df Sum Sq Mean Sq F value  Pr(>F)   
TypeF            2  425.9   213.0   6.329 0.00509 **
  TypeF: 2vs1&3  1  422.9   422.9  12.569 0.00131 **
Residuals       30 1009.4    33.6                   
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Snum:DoseF
                Df Sum Sq Mean Sq F value   Pr(>F)    
DoseF            2  878.4   439.2  20.200 2.78e-06 ***
  DoseF: Linear  1  876.0   876.0  40.291 5.30e-07 ***
  DoseF: Quad    1    2.3     2.3   0.108    0.745    
Residuals       30  652.3    21.7                     
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Snum:TypeF:DoseF
                             Df Sum Sq Mean Sq F value Pr(>F)
TypeF:DoseF                   4   45.7  11.434   1.181  0.328
  TypeF:DoseF: 2vs1&3.Linear  1                              
  TypeF:DoseF: 2vs1&3.Quad    1                              
Residuals                    60  580.9   9.682   

Again, I’m not sure why, but this is another symptom of poor support for repeated measures in R. I’m positive that there is some way to get R to produce the relevant numbers—probably by computing all the relevant sums of squares manually—but I highly doubt your average first-year psychology graduate student is likely to be willing to go that route if the SPSS alternative is available.

Categories: Uncategorized Tags: , ,
%d bloggers like this: