The **sharp null hypothesis** was proposed by Fisher and states that the treatment effect of some intervention is zero *for all subjects*, i.e., $Y_{i}^1 = Y_{i}^0$ for all $i$. This differs from the more standard null hypothesis, which states that the *average treatment effect* is zero; more formally, $\mu_{Y^1} = \mu_{Y^0}$. The **sharp null hypothesis** can be useful because, by assuming it holds, we know both $Y_{i}^1$ and $Y_{i}^0$ for all participants, since the two are assumed to be equal.

### Randomization inference

Given a single set of experimental outcomes, we can test the sharp null hypothesis by simulating all possible unit-to-experimental-group random assignments. For each simulated assignment, we calculate the treated vs. untreated mean outcome difference. This gives us every possible estimate given the subjects in our study, i.e., the distribution of outcomes we could have measured. We can then calculate the probability of observing an estimated ATE as large as the one we actually observed (i.e., the $p$-value) by dividing the number of assignments whose estimate is at least as extreme as ours by the total number of possible assignments.

Below is an example from Gerber and Green, 2012 (*Field Experiments: Design, Analysis, and Interpretation*).

| village | $Y_{i}^0$ | $Y_{i}^1$ | treatment effect |
|:-------:|:----:|:----:|:----------------:|
| v1 | 10 | 15 | 5 |
| v2 | 15 | 15 | 0 |
| v3 | 20 | 30 | 10 |
| v4 | 20 | 15 | -5 |
| v5 | 10 | 20 | 10 |
| v6 | 15 | 15 | 0 |
| v7 | 15 | 30 | 15 |
| **avg** |**15**|**20**| **5** |

In this example, the experimenter used group sizes of 2 (treatment) and 5 (control), which means there are 21 possible ways the experiment could have been assigned:

$ \frac{7!}{5!2!} = 21 $

As stated before, if we assume that $Y_{i}^1 = Y_{i}^0$ and simulate all 21 possible assignments, we can calculate the probability of observing any one specific outcome.
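A minimal Python sketch of this enumeration. It assumes, for illustration, that v1 and v7 were the two treated villages in the actual experiment, so the outcomes revealed (and, under the sharp null, fixed for every assignment) are 15, 15, 20, 20, 10, 15, 30; this choice reproduces the set of 21 estimates listed in the text.

```python
from itertools import combinations

# Outcomes under the sharp null (Y_i^1 = Y_i^0 for all i), assuming for
# illustration that v1 and v7 were treated: they reveal Y^1 (15, 30) and
# the five control villages reveal Y^0 (15, 20, 20, 10, 15).
y = [15, 15, 20, 20, 10, 15, 30]  # v1..v7
n_treat, n = 2, len(y)

# Enumerate every possible assignment of 2 treated villages out of 7.
ates = []
for treated in combinations(range(n), n_treat):
    t_mean = sum(y[i] for i in treated) / n_treat
    c_mean = sum(y[i] for i in range(n) if i not in treated) / (n - n_treat)
    ates.append(round(t_mean - c_mean, 1))

observed = 6.5  # the estimate from the actual assignment
p_one = sum(a >= observed for a in ates) / len(ates)       # one-tailed: 5/21
p_two = sum(abs(a) >= observed for a in ates) / len(ates)  # two-tailed: 8/21
print(sorted(ates))
print(p_one, p_two)
```

With 7 units the loop runs over all $\binom{7}{2} = 21$ assignments, matching the formula above.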
The possible estimated ATE values under these assumptions are: {-7.5, -7.5, -7.5, -4.0, -4.0, -4.0, -4.0, -4.0, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, 3.0, 3.0, 6.5, 6.5, 6.5, 10.0, 10.0}.

If we had run our experiment and found an estimated ATE of 6.5, we could then test the *one-tailed* null hypothesis ("the treatment did not **increase** the outcome variable for any of the subjects") by calculating the probability of seeing values $\ge 6.5$ $\rightarrow \frac{5}{21} \approx 24\%$. Repeating this process for the *two-tailed* null hypothesis ("the treatment did not **change** the outcome variable **in either direction**"), we calculate the same thing but use the *absolute value* of our ATE estimate. We instead get $\rightarrow \frac{8}{21} \approx 38\%$.

### Randomization inference with large $N$

The problem with this approach is that as $N \rightarrow \infty$, simulating all possible assignments quickly becomes infeasible. E.g., for an experiment with 50 subjects evenly split between treatment and control groups, the number of simulations becomes astronomical.

$ \frac{50!}{25!25!} = 126,410,606,437,752 \text{ possible assignments/simulations} $

One solution to this problem is to generate a large random sample of these possible assignments and then repeat the process described above. This gives a close approximation of the sampling distribution. **Whether one uses all possible randomizations or a large sample of them, the calculation of p-values based on an inventory of possible randomizations is called randomization inference.**

### Attractive properties

1. Can be applied to a very broad class of hypotheses and applications.
2. Not confined to large samples or normally distributed outcomes.
3. Any sample size can be utilized.
4. Can be applied to all sorts of outcomes: counts, durations, ranks, etc.
5. The method is "exact" in the sense that the set of all possible random assignments fully describes the sampling distribution under the null hypothesis.

---
#### Related
#causal_inference