A chi-square test for independence is used to test for [[independent|independence]] between two categorical variables. Using a [[contingency table]], compare the observed counts with the counts expected under the assumption that the variables are independent. Using the same test statistic as in a [[chi-square goodness of fit test]], conduct [[hypothesis testing]] to determine whether to reject the null hypothesis of independence. The degrees of freedom equal (number of rows - 1) × (number of columns - 1).

## R

Use `chisq.test()` for the chi-square test for independence.

```R
# Generate data
dogs <- c(41, 76, 80, 38)
cats <- c(28, 54, 56, 27)
data <- cbind(dogs, cats)

chisq.test(data)
```

The result will be:

```

	Pearson's Chi-squared test

data:  data
X-squared = 0.019791, df = 3, p-value = 0.9993

```

- `X-squared`: value of the chi-squared test statistic
- `df`: degrees of freedom
- `p-value`: p-value

Let's try to get the same result manually.

```R
# Generate data
dogs <- c(41, 76, 80, 38)
cats <- c(28, 54, 56, 27)
data <- cbind(dogs, cats)

# Calculate total observations
total_observations <- sum(data)

# Create matrix of expected counts: (row total * column total) / grand total
expected <- outer(rowSums(data), colSums(data)) / total_observations

# Calculate chi-squared test statistic
x_squared <- sum((data - expected)^2 / expected)

# Calculate p-value from the upper tail of the chi-squared distribution
df <- (ncol(data) - 1) * (nrow(data) - 1)
pval <- pchisq(x_squared, df = df, lower.tail = FALSE)

# Print results
print(paste("X-squared:", round(x_squared, 6)))
print(paste("df:", df))
print(paste("p-value:", round(pval, 4)))
```

The results will be the same.

```
[1] "X-squared: 0.019791"
[1] "df: 3"
[1] "p-value: 0.9993"
```

# chi-square test for homogeneity

A chi-square test for homogeneity is used to test whether the distribution of attributes in a stratified random sample is homogeneous across the strata. Using a [[contingency table]], compare the observed counts with the counts expected under the assumption that the data are homogeneous.
Using the same test statistic as in a [[chi-square goodness of fit test]], conduct [[hypothesis testing]] to determine whether to reject the null hypothesis.

Whether to use a test for homogeneity or a test for independence depends on how the random sample was selected. If you selected a simple random sample and then categorized the observations, you can test for independence among the classes. If you selected a stratified random sample, you can test whether the distribution within the classes is homogeneous.
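
As a minimal sketch, the test for homogeneity can be run with the same `chisq.test()` call, because the statistic and degrees of freedom are computed the same way as for independence; only the sampling design and the interpretation differ. The counts below are hypothetical, made up purely for illustration: three strata (age groups) of 100 respondents each, sampled separately. The names `counts` and `result` are also illustrative.

```R
# Hypothetical counts (made up for illustration): pet preference within
# three strata that were each sampled separately, 100 respondents per stratum
counts <- rbind(
  young  = c(dog = 50, cat = 30, other = 20),
  middle = c(dog = 45, cat = 35, other = 20),
  senior = c(dog = 40, cat = 40, other = 20)
)

# Same call as for the test for independence
result <- chisq.test(counts)

# Expected counts under homogeneity: every stratum follows the pooled
# column proportions, scaled by that stratum's sample size
result$expected

# Test statistic, degrees of freedom, and p-value
result$statistic
result$parameter
result$p.value
```

A small p-value would suggest that the distribution of preferences differs between strata; a large one means the data are consistent with a homogeneous distribution.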