up:: [[Statistics MOC]] x:: [[Z scores]] Tags::

# One Sample T Tests

#### The problems with the z score

[[Variance and Standard Deviation#Z Scores|Z scores]] are fantastic. They allow us to use [[Statistical Bias#^1b066a|inferential statistics]] to draw conclusions about a population from a sample. The issue is that they require us to know much more about the population than we normally would: the population mean and the population standard deviation, two things we usually don't know. T scores solve this issue.

### What is the difference between the t statistic and z statistic?

A t statistic is calculated when we don't know the population standard deviation. Instead we use the **estimated standard error** of the mean, calculated from the sample variance or sample standard deviation with this equation:

![[Pasted image 20220921165637.png|300]]

This equation requires that we calculate the sample variance or the sample standard deviation, discussed in my note on [[Variance and Standard Deviation]]. They are calculated with these equations:

![[Pasted image 20220921165919.png|600x200]]

Once we have the estimated standard error of the mean we can calculate the t statistic using a slightly modified version of the z score formula:

![[Pasted image 20220921165736.png|300]]

#### Why do we use variance in the t score equation?

1. The sample variance (computed with the sample variance equation, which divides by n − 1) is an [[Variance and Standard Deviation#^bff1ee|unbiased statistic]].
2. Variance appears in other equations involving the t statistic elsewhere in statistics, so using it here makes the t statistic fit more easily into those equations.

## What is the t distribution?

Similar to the z score distribution, the t distribution is the distribution of every possible t score computed for every possible random sample of a given sample size (n), or degrees of freedom (df).
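The formulas above (sample variance → estimated standard error → t statistic) can be sketched in a few lines of code. This is Python rather than the R used later in this note, and the data values are hypothetical, chosen only for illustration:

```python
import math

def one_sample_t(sample, mu):
    """t statistic for a sample against a hypothesized population mean mu."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance divides by n - 1, making it an unbiased estimator
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # Estimated standard error of the mean: s_M = sqrt(s^2 / n)
    sm = math.sqrt(s2 / n)
    # The z-score formula, with s_M standing in for the unknown sigma_M
    return (mean - mu) / sm

# Hypothetical sample tested against a hypothesized mean of 50
t = one_sample_t([52, 48, 55, 50, 53, 49], 50)
```

The only population value needed is the hypothesized mean μ; everything else comes from the sample itself, which is exactly what separates t from z.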
As the sample size gets larger, the t distribution moves closer and closer to a normal curve, because with a large sample the sample variance closely approximates the population variance (and n and df are almost exactly the same). It's when the sample size is small that the t distribution differs from the normal distribution.

### Hypothesis Testing with the Student's t test

### Step 1: define your H0, H1 and boundaries

The null hypothesis states that the treatment had absolutely no effect; in other words, any measured differences between the sample and the population are sampling error. It's denoted by the symbol H0. When proposing the null hypothesis you also propose the predicted population mean μ.

The alternative hypothesis states that there is a change in the population from the treatment effect. Its symbol is H1.

To reject the null hypothesis we must define what boundary of chance we deem unlikely enough to say the null hypothesis isn't true. The boundary we select is called the alpha criterion (α). Generally α is set at .05 or .01, and we reject the null hypothesis only if the observed result would occur by chance less often than α.

### Step 2: define the tcrit values

Find your df by taking the sample size and subtracting 1. Then use the t unit table to find the tcrit values associated with the alpha value. Remember to take into account whether it's a one tailed or a two tailed test.

### Step 3: calculate your t statistic

First, calculate your sample variance:

![[Pasted image 20220922161710.png]]

Then calculate the estimated standard error:

![[Pasted image 20220921165637.png|300]]

Then calculate the t statistic:

![[Pasted image 20220921165736.png|300]]

### Step 4: make a decision

If your t statistic falls outside of the range of critical values, then you have statistically significant evidence to reject the null hypothesis.

In addition, two things must hold for the t test to be valid:

1. Your observations must be independent of each other.
2. The population being sampled from must be normal.
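The four steps above can be walked through in code. This is a Python sketch with a hypothetical sample; the critical value is taken from a standard t unit table rather than computed:

```python
import math

sample = [52, 48, 55, 50, 53, 49]  # hypothetical data
mu0 = 50        # Step 1: H0 predicts a population mean of 50
alpha = 0.05    # two-tailed alpha criterion

# Step 2: df = n - 1; for df = 5 and alpha = .05 (two-tailed),
# the t unit table gives a critical value of 2.571
n = len(sample)
df = n - 1
t_crit = 2.571

# Step 3: sample variance -> estimated standard error -> t statistic
mean = sum(sample) / n
s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
sm = math.sqrt(s2 / n)
t = (mean - mu0) / sm

# Step 4: reject H0 only if t falls beyond the critical values
reject_h0 = abs(t) > t_crit
```

Here t lands inside the critical region's boundaries, so this hypothetical sample would fail to reject the null hypothesis.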
This is especially true if the sample size is relatively small, but the normality requirement can be effectively ignored if your sample size is extremely large.

### How do degrees of freedom affect the t statistic?

Degrees of freedom, df, is the same as n − 1. The larger the degrees of freedom, the better the sample variance approximates the population variance and the better the sample follows the t distribution. As n gets smaller, the t distribution gets flatter and more spread out.

#### How do sample variance and sample size affect the t statistic?

The larger the sample variance, the larger the estimated standard error and therefore the smaller the t statistic, making it harder to reject the null hypothesis. Conversely, the larger the sample size, the smaller the estimated standard error and therefore the larger the t statistic, making it easier to reject the null hypothesis.

### Effect Size with one sample t tests

### Cohen's d

Finding Cohen's d is different with t statistics. Because we don't have the population standard deviation, we instead calculate the estimated Cohen's d:

![[Pasted image 20220922154256.png|700x200]]

The effect sizes are not significantly altered by the sample size, but they are altered by the variance.

###### Measuring effect size based on Cohen's d number

![[Pasted image 20220929152322.png]]

### r^2

[[Effect size#r 2|r^2]] is another method of showing the treatment effect. It is the square of r, commonly known as the correlation coefficient.

### Confidence Intervals

[[Confidence intervals]] give a range of values that, with some chosen level of confidence, includes the true population mean.

Equation:

![[Pasted image 20220922155432.png|600]]

___

### Rstudio for calculating the T statistic

pt() and qt() serve the same function as pnorm() and qnorm() but for t distributions (though you must of course supply these functions with the degrees of freedom so as to identify the correct t distribution).

![[Pasted image 20220926153726.png]]
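The estimated Cohen's d and the confidence interval from the sections above can also be sketched in code. Again this is Python with hypothetical data, and the critical value comes from a t unit table (in R you would get it from qt(); scipy.stats.t.ppf is the usual Python analogue):

```python
import math

sample = [52, 48, 55, 50, 53, 49]  # hypothetical data
mu0 = 50

n = len(sample)
mean = sum(sample) / n
# Sample standard deviation (square root of the n - 1 sample variance)
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
sm = s / math.sqrt(n)  # estimated standard error of the mean

# Estimated Cohen's d: mean difference over the sample standard deviation
d = (mean - mu0) / s

# 95% confidence interval: M +/- t_crit * s_M, with t_crit = 2.571 for df = 5
t_crit = 2.571
ci = (mean - t_crit * sm, mean + t_crit * sm)
```

Note that the hypothesized mean of 50 falls inside this interval, which agrees with the earlier decision not to reject the null hypothesis at α = .05.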