up: [[Independent Measure T Tests]]
# Welch's t test
Welch's t test lets us run a t test without assuming [[Homogeneity of variance]], unlike [[Student's t test]].
It works best when the sample sizes are similar, though it is robust to small differences between them.
Unlike Student's t test, where df = n1 + n2 - 2, we must use this equation to find the degrees of freedom:
![[Pasted image 20221003150340.png]]
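For reference, a sketch of the formula, assuming the embedded image shows the standard Welch–Satterthwaite approximation (with sample variances $s_1^2, s_2^2$ and sample sizes $n_1, n_2$):
$$
df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
$$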
We use these rounded down degrees of freedom to find the tcrit value on the [[T Unit Table]].
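As a quick sketch of that lookup in R (the df value here is illustrative, not taken from the course data), `qt()` gives the same critical value a printed table would:

```r
# Hypothetical Welch df; round down before using a printed table
df_welch <- 48.26
df_table <- floor(df_welch)            # 48
t_crit <- qt(0.975, df = df_table)     # two-tailed critical value at alpha = .05
t_crit                                 # about 2.01
```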
Finding the t statistic works exactly the same way as [[Independent Measure T Tests|explained here.]]
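As a sketch with made-up numbers (hypothetical, not the course data): the t statistic is the difference between sample means divided by the separate-variances standard error, and it matches what `t.test()` reports for Welch's test:

```r
# Illustrative data (not from Samples50.csv)
g1 <- c(12, 15, 14, 10, 13, 16)
g2 <- c(9, 11, 8, 12, 10, 7)

# Estimated standard error of the difference between the means
se <- sqrt(var(g1)/length(g1) + var(g2)/length(g2))
t_stat <- (mean(g1) - mean(g2)) / se

# Same value that t.test() (Welch by default) reports
t_stat
unname(t.test(g1, g2)$statistic)
```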
Related:
### RStudio
```r
# Independent-measures t-test examples
# ... with repeated-measures t-test illustration following
# PSYCH 2500
samp50 = read.csv("Samples50.csv")
#ctrl = scan("UntreatedSample_50.txt")
#treated = scan("TreatedSample_50.txt")
hist(samp50$ctrl, col=3, xlim=c(100,180), ylim=c(0,15), breaks=seq(100,180,5))
abline(v=mean(samp50$ctrl), lty=1, col="green")
hist(samp50$treated, col=2, xlim=c(100,180), ylim=c(0,15), breaks=seq(100,180,5)) # Axes and bins forced to be the same as for CTRL
abline(v=mean(samp50$ctrl), lty=1, col="green")
abline(v=mean(samp50$treated), lty=1, col="red")
# Student's t-test (uses pooled variance because var.equal=TRUE)
t.test(samp50$ctrl, samp50$treated, var.equal=TRUE)
# Welch's t-test (equal variances not assumed; this is the default)
# Note that the df is not an integer in this "Welch test"
t.test(samp50$ctrl, samp50$treated, var.equal=FALSE)
# Welch test again -- identical, but report an 80% confidence interval
# instead of the default 95% CI
t.test(samp50$ctrl, samp50$treated, var.equal=FALSE, conf.level=0.80)
```
Notice that the `t.test()` function output is a data structure that prints as formatted text:
```r
> t.test(Samp.Tr, Samp.NZ)

	Welch Two Sample t-test

data:  Samp.Tr and Samp.NZ
t = 0.18882, df = 48.261, p-value = 0.851
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -7.153704  8.636813
sample estimates:
mean of x mean of y
 24.24334  23.50179
```
We can take this apart if we want to. The `str()` command gives us a hint about how to do so:
```r
> Output = t.test(Samp.Tr, Samp.NZ)
> str(Output)
List of 10
 $ statistic  : Named num 0.189
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named num 48.3
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 0.851
 $ conf.int   : num [1:2] -7.15 8.64
  ..- attr(*, "conf.level")= num 0.95
 $ estimate   : Named num [1:2] 24.2 23.5
  ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "difference in means"
 $ stderr     : num 3.93
 $ alternative: chr "two.sided"
 $ method     : chr "Welch Two Sample t-test"
 $ data.name  : chr "Samp.Tr and Samp.NZ"
 - attr(*, "class")= chr "htest"
```
That is a lot of information. The basic idea is that we can use the same `$` notation that we use to pull dataframes apart to pull this structure apart and get access to its individual parts. For example, to get just the p-value, we can say:
```r
> Output$p.value
[1] 0.8510253
```
To get just the t statistic, we say:
```r
> Output$statistic
        t
0.1888206
```
This is actually an odd sort of number called a "named num" (check it with the `str()` function), meaning that it is a number labeled "t". You can compute with it normally, but it will carry that "t" name around unless you "un-name" it:
```r
> unname(Output$statistic)
[1] 0.1888206
```
Now you have just the number itself (the value of the t statistic), and can do with it as you will.
___
# Resources