## Translating SPSS to R: t-tests

The first tests my students learn are actually using the binomial, then a permutation test, then testing means using the normal. We don’t use a statistics package for any of those, nor in fact do we use a statistics package for the first few *t*-tests that they do. But this is where things start to pick up, so this is where I try to ease them into doing tests on the computer.

Fortunately, SPSS and R aren’t all that much different when it comes to simple *t*-tests, at least not until you get into more subtle problems. The data set is this one: ttestex.xls.

**Independent Samples**

The basic SPSS code for a between-subjects *t*-test is this:

```
TTEST
  /VAR Height
  /GROUPS Gender(0,1).
```

and it produces output that looks like this:

The best thing about SPSS is that it automatically reports both the equal-variance test and the Welch-adjusted test. On the downside, the system is too dumb to realize that there are only two values for Gender, so you still have to tell it which two values (0 and 1) you want.

Here’s the equivalent in R, requesting both equal and unequal variance versions:

```
t.test(Height~Gender, var.equal=TRUE)
t.test(Height~Gender, var.equal=FALSE)
```

and here’s the output:

```
        Two Sample t-test

data:  Height by Gender
t = -2.5338, df = 19, p-value = 0.02024
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -7.4345615 -0.7082957
sample estimates:
mean in group 0 mean in group 1
       64.00000        68.07143

        Welch Two Sample t-test

data:  Height by Gender
t = -2.5186, df = 11.915, p-value = 0.0271
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -7.5963387 -0.5465185
sample estimates:
mean in group 0 mean in group 1
       64.00000        68.07143
```

The issue of unequal variances is something I’ll get back to later.

**Paired Samples**

Within-subjects testing (or “paired” or “matched” or “dependent,” depending on your preference for language) is also pretty simple in both packages. For SPSS we have this:

```
TTEST
  /PAIRS Initeff Finaleff.
```

That little blurb of syntax produces this output:

All pretty reasonable stuff. However, if someone at SPSS could explain to me why the columns here are in a different order than they are for the independent samples output, I’d really love to know. Is there any logic at all behind the change?

Anyway, the equivalent R:

```
t.test(Initeff, Finaleff, paired=TRUE)
```

And the output:

```
        Paired t-test

data:  Initeff and Finaleff
t = -13.8507, df = 20, p-value = 1.037e-11
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -8.876083 -6.552488
sample estimates:
mean of the differences
              -7.714286
```

I do like that SPSS defaults to reporting the correlation; it’s easy enough to extract in R, though, so it’s not a big deal. Note that in both packages, you could also compute a difference-score variable and test whether its mean is zero. Here’s what that looks like in SPSS:

```
COMPUTE Diff = Finaleff - Initeff.
EXECUTE.

TTEST
  /VAR diff
  /TESTVAL 0.
```

I would like to take this moment to ask why the hell in SPSS it’s necessary to put “EXECUTE” after every simple COMPUTE operation. For the love of all that is good in the world, why? Anyway, here’s what you get for output:

All well and good. The equivalent R is of course more compact:

```
Diff = Finaleff - Initeff
t.test(Diff, mu = 0)
```

Here’s the output produced:

```
        One Sample t-test

data:  Diff
t = 13.8507, df = 20, p-value = 1.037e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 6.552488 8.876083
sample estimates:
mean of x
 7.714286
```

Fantastic, everything matches up. Hey, with only two groups, it can’t be that complicated, can it?

Well, actually, both of these packages could do better. For one thing, neither package reports any measure of effect size; in this context, Cohen’s **d** is what’s usually reported. It’s not hard to compute, but why not report it? The other thing is that neither package does a good job with heteroscedasticity.
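Since neither package hands you **d**, here’s a quick sketch of computing it yourself in R. To be clear, `cohens_d` is my own name for an illustrative function, not anything built in (packages like effsize offer a canned `cohen.d()` if you’d rather not roll your own):

```r
## Illustrative sketch (not built into R): Cohen's d for two
## independent groups, using the pooled standard deviation.
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  ## pooled SD: weighted average of the two group variances
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

## e.g., cohens_d(Height[Gender == 0], Height[Gender == 1])
```

For the paired case you’d typically divide the mean difference by the standard deviation of the difference scores instead.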

**Heteroscedasticity**

For the independent samples test, there is the assumption of equal variances to be considered. SPSS provides a Levene test of the equality of the variances in the two groups. In the example above, SPSS reported an F of .004 and a p-value of .953 for the test. To get the same test in R, you need to import the “lawstat” package so you have access to the “levene.test” function. That function actually defaults to a robust Brown–Forsythe version of the Levene test based on group medians, which is not what SPSS uses. To get the same results as SPSS, you have to tell it to use the mean instead of the median. The R code looks like this:

```
library(lawstat)
levene.test(Height, Gender, location="mean")
```

and the output is this:

```
        classical Levene's test based on the absolute deviations from
        the mean ( none not applied because the location is not set to
        median )

data:  Height
Test Statistic = 0.0035, p-value = 0.9533
```

And that matches the SPSS output. Normally here’s where the R folks could say something snarky like “hah, we support a better test because we can test from the median, which is the default,” but I think those folks are wrong anyway. Even the “improved” version of the Levene test is inferior to the O’Brien test for equality of variances (O’Brien, 1981), so that’s what I teach my students. The test involves transforming each data point and then testing the equality of the transformed data; this generally has better Type I and Type II error rates than the Levene test (though see Moder, 2007 for a counterexample).

The formula for O’Brien’s *r*, applied to each score *x* in a group with mean *m*, variance *s*², and sample size *n*, is this:

    r = ((n − 1.5)·n·(x − m)² − 0.5·s²·(n − 1)) / ((n − 1)·(n − 2))

This, frankly, is a pain in the ass to do in SPSS, because SPSS doesn’t like to compute single variables like the mean or the variance of a variable. For this one I’ll use a different data set, raven.xls. What you have to do in SPSS is use descriptives to get out those values, then set up variables, then conditionally compute things… it’s really stupid. For this data set the dependent variable is called “Raven” and the IV is “Smoke” (smokers vs. non-smokers, I guess). So, first the descriptives:

```
MEANS TABLES Raven BY Smoke /CELL MEAN COUNT VARIANCE.
```

which produces this:

and those values can then be plugged in:

```
COMPUTE m0 = 4.79.
COMPUTE n0 = 14.
COMPUTE var0 = 10.335.
COMPUTE m1 = 4.86.
COMPUTE n1 = 7.
COMPUTE var1 = 2.476.
COMPUTE absdif = ABS(m0 - raven).
IF (smoke EQ 1) absdif = ABS(m1 - raven).
COMPUTE sqrdif = absdif*absdif.
COMPUTE obr = ((n0 - 1.5)*n0*sqrdif - 0.5*var0*(n0 - 1)) /
   ((n0 - 1) * (n0 - 2)).
IF smoke = 1
   obr = ((n1 - 1.5)*n1*sqrdif - 0.5*var1*(n1 - 1)) /
   ((n1 - 1) * (n1 - 2)).
EXECUTE.

T-TEST
  GROUPS=smoke(0 1)
  /VARIABLES = obr.
```

Not exactly pretty, is it? Ultimately you get output that looks like this:

This is much easier in R. In fact, in R, you could write a function to do this. I didn’t do it that way in an attempt to make this easier for students to understand. I’m not sure that actually helped, though, because I used some group indexing via the [] selector. Here’s the R code I provided:

```
## compute vectors for means, ns, and variances
Means = c(mean(Raven[Smoke == 0]), mean(Raven[Smoke == 1]))
Ns = c(length(Raven[Smoke == 0]), length(Raven[Smoke == 1]))
Vars = c(var(Raven[Smoke == 0]), var(Raven[Smoke == 1]))

## compute squared difference scores
SqrDif = (Raven - Means[Smoke+1])^2

## compute O'Brien's r
ObR = (((Ns[Smoke+1]-1.5)*Ns[Smoke+1]*SqrDif) -
       (0.5*Vars[Smoke+1]*(Ns[Smoke+1]-1))) /
      ((Ns[Smoke+1]-1)*(Ns[Smoke+1]-2))

## Tests
t.test(ObR~Smoke)
```

And that produces this output:

```
        Welch Two Sample t-test

data:  ObR by Smoke
t = 3.2414, df = 16.021, p-value = 0.005106
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  2.719638 12.998310
sample estimates:
mean in group 0 mean in group 1
       10.33516         2.47619
```

It doesn’t perfectly agree with the SPSS output because of rounding in the SPSS computations; carrying three decimal places in the means would have made them match exactly. However, the basic upshot is the same: both lead to the conclusion that the two variances are not, in fact, equal.
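As an aside, the function version I mentioned above might look something like this. This is just my sketch of one way to package the same computation; `obrien.test` is a made-up name, not part of base R or any package I know of:

```r
## Sketch of a reusable O'Brien equal-variance test (obrien.test is
## a made-up name, not a built-in function).
obrien.test <- function(y, group) {
  g <- as.factor(group)
  m <- tapply(y, g, mean)[g]    # each observation's group mean
  n <- tapply(y, g, length)[g]  # each observation's group size
  v <- tapply(y, g, var)[g]     # each observation's group variance
  ## O'Brien's transformation of each score
  r <- ((n - 1.5) * n * (y - m)^2 - 0.5 * v * (n - 1)) /
       ((n - 1) * (n - 2))
  ## t-test on the transformed scores = test of equal variances
  t.test(r ~ g)
}

## e.g., obrien.test(Raven, Smoke)
```

A handy property of the transformation, visible in the output above, is that the group means of the transformed scores equal the group variances (10.335 and 2.476 here), which makes for a nice sanity check.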