Why is $SST = SSE + SSR$?

Note: $SST$ = total sum of squares, $SSE$ = sum of squared errors, and $SSR$ = regression sum of squares. The equation in the title is often written as:

$\sum_{i=1}^n (y_i-\bar y)^2=\sum_{i=1}^n (y_i-\hat y_i)^2+\sum_{i=1}^n (\hat y_i-\bar y)^2$

Fairly simple question, but I am looking for an intuitive explanation. Intuitively, it seems to me that $SST\geq SSE+SSR$ would make more sense. For example, suppose point $x_i$ has the corresponding y-value $y_i=5$ and $\hat y_i=3$, where $\hat y_i$ is the corresponding point on the regression line. Also assume that the mean y-value for the dataset is $\bar y=0$. Then for this particular point i, $SST=(5-0)^2=5^2=25$, while $SSE=(5-3)^2=2^2=4$ and $SSR=(3-0)^2=3^2=9$. Obviously, $9+4<25$. Wouldn't this result generalize to the entire dataset? I don't get it.

regression least-squares r-squared Nocken

Very closely related threads also have good answers: stats.stackexchange.com/questions/1447 , stats.stackexchange.com/questions/118 , stats.stackexchange.com/questions/123651 , stats.stackexchange.com/questions/204930 and stats.stackexchange.com/questions/127598 .

whuber

Answers:

Adding and subtracting $\hat y_i$ gives

$\begin{eqnarray*} \sum_{i=1}^n (y_i-\bar y)^2&=&\sum_{i=1}^n (y_i-\hat y_i+\hat y_i-\bar y)^2\\ &=&\sum_{i=1}^n (y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)+\sum_{i=1}^n(\hat y_i-\bar y)^2 \end{eqnarray*}$

So we need to show that $\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=0$. Write

$\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=\sum_{i=1}^n(y_i-\hat y_i)\hat y_i-\bar y\sum_{i=1}^n(y_i-\hat y_i)$

So, (a) the residuals $e_i=y_i-\hat y_i$ need to be orthogonal to the fitted values, $\sum_{i=1}^n(y_i-\hat y_i)\hat y_i=0$, and (b) the sum of the fitted values needs to be equal to the sum of the dependent variable, $\sum_{i=1}^ny_i=\sum_{i=1}^n\hat y_i$.

Actually, I think (a) is easier to show in matrix notation for general multiple regression, of which the single-variable case is a special case:

$\begin{eqnarray*} e'X\hat\beta &=&(y-X\hat\beta)'X\hat\beta\\ &=&(y-X(X'X)^{-1}X'y)'X\hat\beta\\ &=&y'(X-X(X'X)^{-1}X'X)\hat\beta\\ &=&y'(X-X)\hat\beta=0 \end{eqnarray*}$

As for (b), the derivative of the OLS criterion function (the sum of squared residuals, $SSE$ in the question's notation) with respect to the constant (so you need one in the regression for this to be true!), aka the normal equation, is

$\frac{\partial SSE}{\partial\hat\alpha}=-2\sum_i(y_i-\hat\alpha-\hat\beta x_i)=0,$

which can be rearranged to

$\sum_i y_i=n\hat\alpha+\hat\beta\sum_ix_i.$

The right-hand side of this equation evidently also is $\sum_{i=1}^n\hat y_i$, as $\hat y_i=\hat\alpha+\hat\beta x_i$.
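The two conditions can also be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up dataset (the values of `x` and `y` are hypothetical and not from the question) and the usual closed-form OLS estimates for a single regressor with an intercept:

```python
# Hypothetical toy data; any x with at least two distinct values would do.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for y = alpha + beta * x (intercept included).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta = s_xy / s_xx
alpha = y_bar - beta * x_bar

y_hat = [alpha + beta * xi for xi in x]          # fitted values
resid = [yi - fi for yi, fi in zip(y, y_hat)]    # residuals e_i

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum(e ** 2 for e in resid)
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)

# (a) residuals are orthogonal to the fitted values
cond_a = sum(e * fi for e, fi in zip(resid, y_hat))
# (b) the fitted values and the observations have the same sum
cond_b = sum(y) - sum(y_hat)

print("(a) sum e_i * yhat_i      =", cond_a)
print("(b) sum y_i - sum yhat_i  =", cond_b)
print("SST - (SSE + SSR)         =", sst - (sse + ssr))
```

All three printed quantities should be zero up to floating-point rounding, which is what (a) and (b) together imply.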

Christoph Hanck

(1) Intuition for why $SST = SSR + SSE$

When we try to explain the total variation in Y ( $SST$ ) with one explanatory variable, X, there are exactly two sources of variability. First, there is the variability captured by X (the regression sum of squares, $SSR$), and second, there is the variability not captured by X (the error sum of squares, $SSE$). Hence, $SST = SSR + SSE$ (exact equality).

(2) Geometric intuition

Please see the first few pictures here (especially the third): https://sites.google.com/site/modernprogramevaluation/variance-and-bias

Some of the total variation in the data (the distance from a data point to $\bar{Y}$ ) is captured by the regression line (the distance from the regression line to $\bar{Y}$ ) and the rest is error (the distance from the point to the regression line). There is no room left for $SST$ to be greater than $SSE + SSR$ .

(3) The problem with your illustration

You can't look at SSE and SSR in a pointwise fashion. For a particular point, the residual may be large, so that there is more error than explanatory power from X. However, for other points, the residual will be small, so that the regression line explains a lot of the variability. They will balance out and ultimately $SST = SSR + SSE$ . Of course this is not rigorous, but you can find proofs like the above.
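To make the pointwise issue concrete, here is a short worked identity using the numbers from the question ( $y_i=5$, $\hat y_i=3$, $\bar y=0$ ). It is only an illustration of the algebra, not part of the original answer: for a single point, the square of a sum contains a cross term, and it is this cross term (not the squared terms) that cancels once you sum over the whole dataset.

```latex
\begin{aligned}
(y_i-\bar y)^2 &= (y_i-\hat y_i)^2 + (\hat y_i-\bar y)^2 + 2\,(y_i-\hat y_i)(\hat y_i-\bar y) \\
25 &= 4 + 9 + 2\cdot 2\cdot 3 = 4 + 9 + 12
\end{aligned}
```

So the "missing" 12 for this point is the cross term. With an intercept in the regression, the cross terms of all points sum to zero, which leaves $SST = SSE + SSR$ for the dataset even though it fails for an individual point.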

Also notice that the regression is not defined for a single point: $b_1 = \frac{\sum(X_i -\bar{X})(Y_i-\bar{Y}) }{\sum (X_i -\bar{X})^2}$ , and with only one point $X_i=\bar{X}$ , so the denominator is zero and the estimate is undefined.

Hope this helps.

--Ryan M.

RMurphy

When an intercept is included in the linear regression (so that the sum of the residuals is zero), $SST=SSE+SSR$ .

Proof:

$\begin{eqnarray*} SST&=&\sum_{i=1}^n (y_i-\bar y)^2\\&=&\sum_{i=1}^n (y_i-\hat y_i+\hat y_i-\bar y)^2\\&=&\sum_{i=1}^n (y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)+\sum_{i=1}^n(\hat y_i-\bar y)^2\\&=&SSE+SSR+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y) \end{eqnarray*}$

We just need to prove that the last term is equal to 0. With $\hat y_i=\beta_0+\beta_1x_i$ :

$\begin{eqnarray*} \sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)&=&\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)(\beta_0+\beta_1x_i-\bar y)\\&=&(\beta_0-\bar y)\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)+\beta_1\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)x_i \end{eqnarray*}$

In least-squares regression, the sum of the squares of the errors is minimized.

$SSE=\sum_{i=1}^n e_i^2= \sum_{i=1}^n\left(y_i - \hat{y}_i \right)^2= \sum_{i=1}^n\left(y_i -\beta_0- \beta_1x_i\right)^2$

Take the partial derivative of $SSE$ with respect to $\beta_0$ and set it to zero:

$\frac{\partial{SSE}}{\partial{\beta_0}} = -2\sum_{i=1}^n \left(y_i - \beta_0 - \beta_1x_i\right) = 0$

So

$\sum_{i=1}^n \left(y_i - \beta_0 - \beta_1x_i\right) = 0$

Take the partial derivative of $SSE$ with respect to $\beta_1$ and set it to zero:

$\frac{\partial{SSE}}{\partial{\beta_1}} = -2\sum_{i=1}^n \left(y_i - \beta_0 - \beta_1x_i\right) x_i = 0$

So

$\sum_{i=1}^n \left(y_i - \beta_0 - \beta_1x_i\right) x_i = 0$

Hence,

$\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=(\beta_0-\bar y)\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)+\beta_1\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)x_i=0$

$SST=SSE+SSR+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=SSE+SSR$
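The role of the intercept can be seen by dropping it. The sketch below is an illustration only: it assumes a made-up dataset and a regression through the origin, $\hat y_i = b\,x_i$ with $b=\sum x_iy_i/\sum x_i^2$ (the least-squares slope when no constant is fitted). Only the $\beta_1$-type normal equation holds in that case, so the residuals need not sum to zero and the cross term generally does not vanish.

```python
# Hypothetical toy data whose best-fitting line clearly does not pass through the origin.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [12.0, 13.0, 15.0, 14.0, 16.0]

n = len(y)
y_bar = sum(y) / n

# Least-squares slope for the no-intercept model y = b * x.
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

y_hat = [b * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, y_hat)]

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum(e ** 2 for e in resid)
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)

# Cross term from the expansion SST = SSE + SSR + 2 * cross.
cross = sum(e * (fi - y_bar) for e, fi in zip(resid, y_hat))

print("sum of residuals      =", sum(resid))
print("SST                   =", sst)
print("SSE + SSR             =", sse + ssr)
print("SSE + SSR + 2 * cross =", sse + ssr + 2 * cross)
```

The expansion with the cross term still holds exactly, but $SSE+SSR$ on its own no longer matches $SST$ for this data, because the sum of the residuals is not zero.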

DavidCruise

This is just the Pythagorean theorem!
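One way to spell this out in vector form is sketched below; the notation is an assumption for illustration ( $\mathbf{1}$ for the vector of ones, $y$ and $\hat y$ for the vectors of observations and fitted values) and is not taken from the picture. It assumes a regression with an intercept, so that the residual vector is orthogonal to both $\hat y$ and $\mathbf{1}$.

```latex
\begin{aligned}
y-\bar y\,\mathbf{1} &= (y-\hat y) + (\hat y-\bar y\,\mathbf{1}),
\qquad (y-\hat y)'(\hat y-\bar y\,\mathbf{1}) = 0 \\
\lVert y-\bar y\,\mathbf{1}\rVert^2 &= \lVert y-\hat y\rVert^2 + \lVert \hat y-\bar y\,\mathbf{1}\rVert^2
\quad\Longleftrightarrow\quad SST = SSE + SSR
\end{aligned}
```

The two legs of the right triangle are the residual vector and the centered fitted values, and the hypotenuse is the centered data vector.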

user0

stats.stackexchange.com/q/71620/171583, stats.stackexchange.com/a/256532/171583.

ayorgo