A chi-square test for independence is used to test whether two categorical variables observed in a single sample are associated; the null hypothesis is [[independent|independence]].
Using a [[contingency table]], compare the observed counts with the counts expected under the assumption that the two variables are independent. Using the same test statistic as in a [[chi-square goodness of fit test]], conduct [[hypothesis testing]] to determine whether to reject the null hypothesis. The degrees of freedom equal (number of rows - 1) × (number of columns - 1).
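In symbols, the calculation can be written as follows. The notation is introduced here only for illustration: $O_{ij}$ is the observed count in row $i$ and column $j$ of a table with $r$ rows and $c$ columns, $R_i$ and $C_j$ are the row and column totals, and $n$ is the total count.

$$
E_{ij} = \frac{R_i \, C_j}{n}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
$$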
## R
Use `chisq.test()` for the chi-square test for independence.
```R
# Generate data
dogs <- c(41, 76, 80, 38)
cats <- c(28, 54, 56, 27)
data <- cbind(dogs, cats)
chisq.test(data)
```
The result will be:
```
Pearson's Chi-squared test
data: data
X-squared = 0.019791, df = 3, p-value = 0.9993
```
- `X-squared`: value of the chi-squared test statistic
- `df`: degrees of freedom
- `p-value`: p-value
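The printed summary is only part of what `chisq.test()` returns. As a small sketch (the variable name `result` is chosen here for illustration), the individual values can be pulled out of the returned object by name, which is handy when they feed into later code:

```R
# Same data as above
dogs <- c(41, 76, 80, 38)
cats <- c(28, 54, 56, 27)
data <- cbind(dogs, cats)

# Store the test result instead of only printing it
result <- chisq.test(data)

result$statistic  # chi-squared test statistic
result$parameter  # degrees of freedom
result$p.value    # p-value
result$expected   # matrix of expected counts under independence
```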
Let's try to get the same result manually.
```R
# Generate data
dogs <- c(41, 76, 80, 38)
cats <- c(28, 54, 56, 27)
data <- cbind(dogs, cats)
# Calculate total observations
total_observations <- sum(data)
# Create matrix of expected data
expected <- outer(rowSums(data), colSums(data)) / total_observations
# Calculate chi-squared test statistic
x_squared <- sum((data - expected)^2 / expected)
# Calculate p-value
df <- (ncol(data) - 1) * (nrow(data) - 1)
pval <- pchisq(x_squared, df=df, lower.tail=FALSE)
# Print results
print(paste("X-squared:", round(x_squared, 6)))
print(paste("df:", df))
print(paste("p-value:", round(pval, 4)))
```
The results will be the same.
```
[1] "X-squared: 0.019791"
[1] "df: 3"
[1] "p-value: 0.9993"
```
# chi-square test for homogeneity
A chi-square test for homogeneity is used to test whether the distribution of a categorical variable is the same across the strata of a stratified random sample, that is, whether it is homogeneous.
Using a [[contingency table]], compare the observed counts with the counts expected under the assumption that every stratum shares the same distribution. Using the same test statistic as in a [[chi-square goodness of fit test]], conduct [[hypothesis testing]] to determine whether to reject the null hypothesis.
Whether to use a test for homogeneity or a test for independence depends on how the random sample was selected. If you selected a single simple random sample and then classified each observation by two categorical variables, you can test for independence between those variables. If you selected a stratified random sample (a separate sample from each group), you can test whether the distribution of the categorical variable is homogeneous across the groups.
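In R, the homogeneity test can be computed with the same `chisq.test()` call as the independence test; only the sampling design and the interpretation differ. The sketch below uses made-up counts (hypothetical data, not from a real study): 100 people sampled separately from each of three regions, each asked which pet they prefer.

```R
# Hypothetical stratified sample: 100 respondents drawn from each region
north <- c(dog = 45, cat = 35, other = 20)
south <- c(dog = 50, cat = 30, other = 20)
west  <- c(dog = 40, cat = 40, other = 20)

# Rows are strata (regions), columns are response categories
counts <- rbind(north, south, west)

# Null hypothesis: the distribution of pet preference is the same in every region
result <- chisq.test(counts)
result

# Degrees of freedom: (3 regions - 1) * (3 categories - 1) = 4
result$parameter
```

Because the row totals were fixed in advance by the sampling design, a small p-value here would be read as evidence that the regions differ in their preference distribution, rather than as evidence that two variables are associated within one population.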