The prediction interval is an interval estimate for a future observation that was not used to fit the model.
Given a linear model $\hat y = X \hat \beta$, we mark quantities belonging to new observations with a star.
$\hat y^\star = X^\star \hat \beta + \epsilon$
Essentially, we multiply the new [[design matrix]] by the vector of estimated coefficients $\hat \beta$.
The prediction interval is given by
$\hat y_i^\star \pm t_{\alpha/2, \ (n - (p + 1))} \sqrt{\hat \sigma^2 \Big[x_i^\star (X^TX)^{-1}{x_i^\star}^T + 1 \Big]}$
This is derived from the fact that
$\begin{align}
Var(x_i^\star \hat \beta + \epsilon_i) &\overset{indep}{=} Var(x_i^\star \hat \beta) + Var(\epsilon_i) \\
&= \sigma^2 x_i^\star (X^TX)^{-1}{x_i^\star}^T + \sigma^2 \\
&= \sigma^2 \Big[x_i^\star (X^TX)^{-1}{x_i^\star}^T + 1 \Big]
\end{align}$
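The variance formula above can be checked numerically. The sketch below, using `mtcars` purely as illustrative data (a simple regression of `mpg` on `wt`, with `wt = 3.0` as a hypothetical new point), computes the interval by hand from $\hat \sigma^2 \big[x^\star (X^TX)^{-1}{x^\star}^T + 1\big]$:

```R
# Fit a simple linear model; mtcars and the new point wt = 3.0 are
# illustrative choices, not part of the derivation above.
fit <- lm(mpg ~ wt, data = mtcars)
X <- model.matrix(fit)                 # training design matrix (n x (p+1))
x_star <- c(1, 3.0)                    # new row: intercept term and wt = 3.0

n <- nrow(X)
p <- ncol(X) - 1                       # number of predictors
sigma2_hat <- sum(resid(fit)^2) / (n - (p + 1))   # residual variance estimate

# sqrt of sigma^2 * [x* (X'X)^-1 x*' + 1], as in the formula above
se_pred <- sqrt(sigma2_hat * (t(x_star) %*% solve(t(X) %*% X) %*% x_star + 1))

y_hat <- sum(x_star * coef(fit))       # point prediction x* beta-hat
t_crit <- qt(1 - 0.05 / 2, df = n - (p + 1))

c(lower = y_hat - t_crit * se_pred, upper = y_hat + t_crit * se_pred)
```

The result agrees with `predict(fit, newdata = data.frame(wt = 3.0), interval = "prediction")` up to numerical precision.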
The prediction interval will always be wider than the confidence interval.
In [[R]], use the `predict` function with the `interval` argument to get prediction intervals. Pass the new observations as a data frame via `newdata` (e.g., the portion of the data reserved for testing); `predict` builds the new [[design matrix]] from it.
```R
predict(fit, newdata = test_data, interval = "prediction", level = 0.95)
```
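As a complete, runnable sketch (the train/test split of `mtcars` is an arbitrary illustration):

```R
# Fit on a training split, then get 95% prediction intervals
# for the held-out rows.
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]
fit <- lm(mpg ~ wt + hp, data = train)
predict(fit, newdata = test, interval = "prediction", level = 0.95)
# returns a matrix with columns fit, lwr, upr: the point prediction
# and the lower/upper bounds of the interval for each test row
```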
# interpretation of the prediction interval
A prediction interval is a range of plausible values for the response value of a new measurement at the given predictors.
Imagine we could fix the predictors in the training data and resample the responses many times. If we refit the model at the same predictor values and recomputed the prediction interval each time, $100(1-\alpha)\%$ of the intervals would contain the true value of the new response. In practice this is not feasible, as it would be next to impossible to find more statistical units that match the original data in every value but the response.
Prediction intervals account for both the variation in estimating the population mean and the random variation of the individual response values. For this reason, the prediction interval is always wider than the [[confidence interval for the mean response]].
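This width difference is easy to verify: at the same new point, both intervals are centered on the fitted value, but the prediction interval adds the extra $\sigma^2$ term. A sketch, again using `mtcars` as illustrative data:

```R
fit <- lm(mpg ~ wt, data = mtcars)
new_pt <- data.frame(wt = 3.0)   # hypothetical new observation

conf_int <- predict(fit, newdata = new_pt, interval = "confidence")
pred_int <- predict(fit, newdata = new_pt, interval = "prediction")

# Same center (the fitted value), but the prediction interval is wider
# because it includes the variance of the individual response.
(pred_int[, "upr"] - pred_int[, "lwr"]) > (conf_int[, "upr"] - conf_int[, "lwr"])
```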