Linear Models Lecture 2: Inference with Linear Regression

Overview

In this notebook we cover how to do inference with linear regression: constructing confidence intervals for the coefficients $b_0$ and $b_1$, for the mean response, and for predictions of new observations.

Recap

To recap: we were able to get point estimators for the linear regression model coefficients,

$$ b_1 = \dfrac{ \sum_i \left( X_i - \overline{X} \right) \left( Y_i - \overline{Y} \right) }{ \sum_i \left( X_i - \overline{X} \right)^2 } $$

and

$$ b_0 = \overline{Y} - b_1 \overline{X} $$
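
As a quick sanity check of these formulas, here is a minimal sketch in Python (NumPy) that computes the point estimates on synthetic data; the data-generating values ($\beta_0 = 2$, $\beta_1 = 3$, the noise scale) are arbitrary choices for illustration:

```python
import numpy as np

# Synthetic data for illustration: Y = 2 + 3 X + Gaussian noise
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)

# Point estimates from the recap formulas
Xbar, Ybar = X.mean(), Y.mean()
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
b0 = Ybar - b1 * Xbar
print(b0, b1)  # should land near 2 and 3
```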

Coefficient b1

We are going to discuss some statistics for $b_1$, namely, how to estimate the sampling variance of $b_1$ and use it to build confidence intervals for $\beta_1$.

From the Gauss-Markov theorem, $b_1$ is an unbiased point estimator of $\beta_1$, and we know:

$$ E(b_1) = \beta_1 \\ \sigma^2 (b_1) = \dfrac{ \sigma^2 }{ \sum \left( X_i - \overline{X} \right)^2 } $$

Note also that $\sigma^2(b_1)$ is the variance of the sampling distribution of $b_1$, whereas $\sigma^2$ is the variance of the $Y_i$.

To get an estimate $s^2(b_1)$ of the variance $\sigma^2(b_1)$ of the $b_1$ distribution, use the mean squared error $MSE = SSE/(n-2)$:

$$ s^2 (b_1) = \dfrac{ MSE }{ \sum \left(X_i - \overline{X} \right)^2 } = \dfrac{ MSE }{ \sum X_i^2 - \dfrac{ \left( \sum X_i \right)^2 }{n} } $$

Now we can carry out a t-test using the standardized statistic, which is given by:

$$ \dfrac{ b_1 - \beta_1 }{ s(b_1) } $$

Specifically, we know that the standardized statistic should be distributed like a t distribution with n-2 degrees of freedom:

$$ \dfrac{ b_1 - \beta_1 }{ s(b_1) } \sim t(n-2) $$
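
For example, this is exactly the statistic used to test $H_0: \beta_1 = 0$. A minimal sketch (reusing the synthetic data from above, with SciPy supplying the t distribution):

```python
import numpy as np
from scipy import stats

# Same synthetic data as above
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)
n, Xbar = X.size, X.mean()

b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
b0 = Y.mean() - b1 * Xbar

# s^2(b1) = MSE / sum((X_i - Xbar)^2), with MSE = SSE / (n - 2)
mse = np.sum((Y - (b0 + b1 * X)) ** 2) / (n - 2)
s_b1 = np.sqrt(mse / np.sum((X - Xbar) ** 2))

# Standardized statistic under H0: beta_1 = 0, two-sided p-value from t(n-2)
t_stat = (b1 - 0.0) / s_b1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(t_stat, p_value)
```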

Why use the t distribution? The applicability of the t distribution follows from the following theorem (together with the fact that $b_1$ is independent of the SSE):

$\dfrac{SSE}{\sigma^2}$ is distributed as $\chi^2$ with $n-2$ degrees of freedom. That is,

$$ \dfrac{SSE}{\sigma^2} \sim \chi^2(n-2) $$
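
This claim is easy to verify empirically. Here is a minimal simulation sketch (all parameter values are arbitrary): a $\chi^2(n-2)$ variable has mean $n-2$ and variance $2(n-2)$, and the simulated $SSE/\sigma^2$ values should match.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 20, 1.5
X = np.linspace(0, 5, n)
Xbar = X.mean()

draws = []
for _ in range(5000):
    # Fit the regression and record SSE / sigma^2 for each simulated dataset
    Y = 1.0 + 2.0 * X + rng.normal(scale=sigma, size=n)
    b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
    b0 = Y.mean() - b1 * Xbar
    sse = np.sum((Y - (b0 + b1 * X)) ** 2)
    draws.append(sse / sigma**2)

print(np.mean(draws), n - 2)       # mean should be close to n - 2 = 18
print(np.var(draws), 2 * (n - 2))  # variance should be close to 36
```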

Now we can make a probability statement that brackets $\beta_1$:

$$ P \left\{ t \left( \frac{\alpha}{2}; n-2 \right) \leq \dfrac{b_1 - \beta_1}{s(b_1)} \leq t \left( 1-\frac{\alpha}{2}; n-2 \right) \right\} = 1-\alpha $$

This still isn't easy to use, but noting that $t \left( \frac{\alpha}{2}; n-2 \right) = -t \left( 1-\frac{\alpha}{2}; n-2 \right)$, a bit of rearrangement gives the $1-\alpha$ confidence limits for $\beta_1$:

$$ b_1 \pm t \left( 1 - \frac{\alpha}{2}; n-2 \right) s \left( b_1 \right) $$
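
The interval is straightforward to compute with SciPy's `t.ppf` for the critical value; a minimal sketch, again on the synthetic data from above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)
n, Xbar = X.size, X.mean()

b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
b0 = Y.mean() - b1 * Xbar
mse = np.sum((Y - (b0 + b1 * X)) ** 2) / (n - 2)
s_b1 = np.sqrt(mse / np.sum((X - Xbar) ** 2))

# 95% confidence interval: b1 +/- t(1 - alpha/2; n-2) * s(b1)
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print(b1 - t_crit * s_b1, b1 + t_crit * s_b1)
```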

Coefficient b0

Apply the same procedure to the coefficient $\beta_0$ and its unbiased linear estimator $b_0$:

$$ b_0 = \overline{Y} - b_1 \overline{X} $$

Statistics for $b_0$:

$$ E(b_0) = \beta_0 $$

and the variance of the $b_0$ distribution:

$$ \sigma^2(b_0) = \sigma^2 \left( \dfrac{ \sum_i X_i^2 }{ n \sum_i \left( X_i - \overline{X} \right)^2 } \right) $$

or, rewriting,

$$ \sigma^2(b_0) = \sigma^2 \left( \frac{1}{n} + \frac{\overline{X}^2}{ \sum \left( X_i - \overline{X} \right)^2 } \right) $$

(where, as before, $\sigma^2$ is the variance of the response distribution, while $\sigma^2(b_0)$ is the variance of the $b_0$ distribution.)

Now the variance of $b_0$ can be estimated via the MSE:

$$ s^2(b_0) = MSE \left( \frac{1}{n} + \dfrac{ \overline{X}^2 }{ \sum \left( X_i - \overline{X} \right)^2 } \right) $$

Using the same arguments from above (about the SSE being distributed according to a $\chi^2$ distribution), we get the following confidence interval for $b_0$:

$$ b_0 \pm t \left( 1 - \frac{\alpha}{2} ; n-2 \right) s \left( b_0 \right) $$
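
The computation mirrors the $b_1$ interval; a minimal sketch on the same synthetic data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)
n, Xbar = X.size, X.mean()

b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
b0 = Y.mean() - b1 * Xbar
mse = np.sum((Y - (b0 + b1 * X)) ** 2) / (n - 2)

# s^2(b0) = MSE * (1/n + Xbar^2 / sum((X_i - Xbar)^2))
s_b0 = np.sqrt(mse * (1 / n + Xbar**2 / np.sum((X - Xbar) ** 2)))

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print(b0 - t_crit * s_b0, b0 + t_crit * s_b0)
```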

System Response Yh

We'll just keep applying the same procedure as above. This time, we have a new input $X_h$, and we wish to determine a confidence interval for the mean response at that input.

$\hat{Y}_h$ is the fitted model evaluated at the input $X_h$:

$$ \hat{Y}_h = b_0 + b_1 X_h $$

Statistics:

$$ E(\hat{Y}_h) = E(Y_h) = \beta_0 + \beta_1 X_h $$

and the variance is given by:

$$ \sigma^2( \hat{Y}_h ) = \sigma^2 \left( \frac{1}{n} + \dfrac{ \left( X_h - \overline{X} \right)^2 }{ \sum \left( X_i - \overline{X} \right)^2 } \right) $$

To estimate this quantity, we should use the MSE:

$$ s^2(\hat{Y}_h) = MSE \left( \frac{1}{n} + \dfrac{ \left(X_h - \overline{X}\right)^2 }{ \sum \left( X_i - \overline{X} \right)^2 } \right) $$

Finally, using the same $\chi^2$ result for the SSE, we can assemble a confidence interval from the t distribution using the standardized statistic:

$$ \hat{Y}_h \pm t \left( 1 - \frac{\alpha}{2}; n-2 \right) s \left( \hat{Y}_h \right) $$
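
A minimal sketch for the mean-response interval at a chosen input (here $X_h = 7$, an arbitrary value), on the same synthetic data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)
n, Xbar = X.size, X.mean()

b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
b0 = Y.mean() - b1 * Xbar
mse = np.sum((Y - (b0 + b1 * X)) ** 2) / (n - 2)

# Fitted value and its standard error at X_h
Xh = 7.0
Yh_hat = b0 + b1 * Xh
s_Yh = np.sqrt(mse * (1 / n + (Xh - Xbar) ** 2 / np.sum((X - Xbar) ** 2)))

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print(Yh_hat - t_crit * s_Yh, Yh_hat + t_crit * s_Yh)
```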

Group Predictions

If we are predicting the mean of $m$ new observations at a given $X_h$, there are two sources of variability to account for: the uncertainty in where the regression line sits (i.e., in $\hat{Y}_h$), and the scatter of the new observations around the line.

Accordingly, the prediction limits are:

$$ \hat{Y}_h \pm t \left( 1 - \frac{\alpha}{2}; n-2 \right) s \left( \overline{Y}_{h(new)} \right) $$

and the variance estimate comes from:

$$ s^2 \left( \overline{Y}_{h(new)} \right) = MSE \left( \frac{1}{m} + \frac{1}{n} + \dfrac{ \left( X_h - \overline{X} \right)^2 }{ \sum \left( X_i - \overline{X} \right)^2 } \right) $$
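
A minimal sketch for the mean of $m$ new observations at $X_h$ (here $m = 5$ and $X_h = 7$, both arbitrary), again on the same synthetic data; note the extra $1/m$ term, which keeps this interval wider than the mean-response interval above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 30)
Y = 2.0 + 3.0 * X + rng.normal(scale=2.0, size=X.size)
n, Xbar = X.size, X.mean()

b1 = np.sum((X - Xbar) * (Y - Y.mean())) / np.sum((X - Xbar) ** 2)
b0 = Y.mean() - b1 * Xbar
mse = np.sum((Y - (b0 + b1 * X)) ** 2) / (n - 2)

# Prediction limits for the mean of m new observations at X_h
m, Xh = 5, 7.0
Yh_hat = b0 + b1 * Xh
s_pred = np.sqrt(mse * (1 / m + 1 / n + (Xh - Xbar) ** 2 / np.sum((X - Xbar) ** 2)))

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print(Yh_hat - t_crit * s_pred, Yh_hat + t_crit * s_pred)
```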