### What is Hypothesis Testing?

Hypothesis testing is a statistical method for making inferences about a population from sample data. It is an [[Statistical Bias#^1b066a|inferential process]] because it uses descriptive statistics computed from a sample to make inferences about a population. This is extremely useful because in studies it's often very hard to get data from an entire population; it's much easier to use sample data to make inferences about the population. The goal of a hypothesis test is to determine whether a treatment has an effect on a population.

## The Procedure for Hypothesis Testing

### Step 1: State the Null and Alternative Hypotheses

The null hypothesis states that the treatment had no effect: any measured differences between the sample and the population are due to sampling error. It's denoted by the symbol H0. When proposing the null hypothesis you also state the predicted population mean μ. The alternative hypothesis states that the treatment does change the population; its symbol is H1.

### Step 2: Define the Boundaries

To reject the null hypothesis we must first define which probabilities we deem unlikely enough to say the null hypothesis isn't true. The boundary we select is called the alpha criterion (α), generally set to .05 or .01. For a two-tailed test on a [[Probability#^0dcb22|normal distribution]], α = .05 corresponds to a [[Variance and Standard Deviation#^eb8a36|z score]] of ±1.96, called zcrit (the critical value of z). A one-tailed test at α = .05 corresponds to a z score of 1.65, and a two-tailed test at α = .01 corresponds to a z score of ±2.58.

Visual representation of a two-tailed test:

![[Pasted image 20220920075720.png]]

> "One tailed tests are for scoundrels and miscreants." - Thomas Cleland

The only time a one-tailed test would be excusable is if there were prior justification for a directional prediction.
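The critical values above can be double-checked with Python's standard library `statistics.NormalDist` (a minimal sketch; the unrounded values are 1.960, 1.645, and 2.576):

```python
from statistics import NormalDist

def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Critical z value (zcrit) for a given alpha criterion."""
    tail = alpha / 2 if two_tailed else alpha  # two-tailed tests split alpha across both tails
    return NormalDist().inv_cdf(1 - tail)

print(round(z_critical(0.05), 3))                    # two-tailed, alpha = .05 -> 1.96
print(round(z_critical(0.05, two_tailed=False), 3))  # one-tailed, alpha = .05 -> 1.645
print(round(z_critical(0.01), 3))                    # two-tailed, alpha = .01 -> 2.576
```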
This would have to be supported by prior data or analysis. The critical region comprises the values of the distribution for which the null hypothesis is rejected.

![[Pasted image 20220913091959.png]]

The ==z score== is to the ==zcrit== as the ==p-value== is to the ==alpha criterion.==

![[Pasted image 20220914153317.png]]

#### P-Values

The [[p-value]] is the probability of obtaining your test statistic or a more extreme result given that the null hypothesis is true. You can find the p-value of a z score using the unit normal table. For example, find the p-value of z = ±1.55:

![[Pasted image 20220914154134.png]]

In this example the proportion below z = 1.55 is about .939, so the tail beyond it is 1 - .939 = .061. Because this is a two-tailed test we double that tail, meaning the p-value corresponding to a z score of ±1.55 is about .12.

### Step 3: Collect the Data and Compute Values

*After* we have posed our null and alternative hypotheses it's time to collect the data. Then we compute how likely our sample mean would be if the null hypothesis were true by finding the z score (test statistic) of the sample mean with this equation:

![[Pasted image 20220913092350.png]]

Another way of stating this equation is:

![[Pasted image 20220914152251.png]]

Remember that the standard error of the mean is σ/sqrt(n).

### Step 4: Make a Decision

After we have found the z score for the sample mean it's time to make a decision using our previously defined alpha value. If the z score we found falls inside the critical region, then we have *statistically significant* evidence to reject the null hypothesis and state there is a significant difference caused by the treatment.

### Two Types of Hypothesis Failures

Unfortunately there are times when our hypothesis test will be wrong. This is the nature of using a sample to make inferences about a larger population. There are two types of errors we can make.
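Steps 3 and 4 can be sketched as a one-sample z test in Python. The numbers below (μ = 100, σ = 15, n = 25, sample mean 106) are made-up illustration values, not from the notes:

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean: float, mu: float, sigma: float, n: int):
    """One-sample z test: returns (z score, two-tailed p-value)."""
    se = sigma / sqrt(n)                     # standard error of the mean
    z = (sample_mean - mu) / se              # z = (M - mu) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # double one tail for a two-tailed test
    return z, p

z, p = z_test(sample_mean=106, mu=100, sigma=15, n=25)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 2.00, p = 0.0455
print("reject H0" if p < 0.05 else "fail to reject H0")  # p < alpha, so reject H0
```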
**Type 1 Errors:** Type 1 errors occur when we reject a null hypothesis that was actually true. In other words, we say the treatment had an effect on the population when it actually didn't. The probability of a type 1 error equals the alpha criterion. They are considered much worse than type 2 errors. Type 1 errors are not affected by sample size.

**Type 2 Errors:** Type 2 errors occur when we fail to reject a null hypothesis that was actually false. In other words, we say the treatment had no effect when it actually did. It's harder to find the probability of a type 2 error, but it's denoted by β.

## Statistical Power

[[Statistical Power]] is the probability that the null hypothesis is rejected when there is a treatment effect. It's the same thing as 1 - β, where β is the probability of a Type 2 error, because β is the probability the null hypothesis is not rejected when it's false.
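Power (1 - β) for a two-tailed one-sample z test can be sketched as follows. This assumes a hypothetical true mean μ1; the example values (μ0 = 100, μ1 = 106, σ = 15, n = 25) are illustrative only:

```python
from math import sqrt
from statistics import NormalDist

def power_z_test(mu0: float, mu1: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power (1 - beta) of a two-tailed one-sample z test, given a true mean mu1."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)        # critical z for the two-tailed alpha
    delta = (mu1 - mu0) / (sigma / sqrt(n))   # true shift measured in standard errors
    # beta: probability the test statistic lands inside (-z_crit, z_crit) under H1
    beta = nd.cdf(z_crit - delta) - nd.cdf(-z_crit - delta)
    return 1 - beta

print(round(power_z_test(mu0=100, mu1=106, sigma=15, n=25), 3))  # ~0.516
```

Note that, unlike the Type 1 error rate, power grows with sample size: increasing n shrinks the standard error, which pushes more of the H1 distribution past zcrit.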