Evaluating Regression Model

Goodness of Fit

| Source | df | SS | MS |
| --- | --- | --- | --- |
| Regression | k | RSS | MSR = RSS / k |
| Error | n - k - 1 | SSE | MSE = SSE / (n - k - 1) |
| Total | n - 1 | SST | |
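  • Standard error of estimate: $\mathrm{SEE} = \sqrt{\mathrm{MSE}}$.
  • F-statistic: $F = \mathrm{MSR} / \mathrm{MSE}$.
  • Coefficient of determination: $R^2 = \mathrm{RSS} / \mathrm{SST}$.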
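The quantities in the table and list above can be computed directly from observed and fitted values. The following is a minimal sketch in plain Python, assuming a simple regression (k = 1) fitted by ordinary least squares with an intercept; the helper names `fit_simple_ols` and `anova_summary` and the five data points are illustrative, not part of the original notes.

```python
import math

def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single regressor (k = 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def anova_summary(y, y_hat, k):
    """ANOVA quantities for a fitted model with k independent variables."""
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error
    rss = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression
    msr = rss / k
    mse = sse / (n - k - 1)
    return {
        "SST": sst, "SSE": sse, "RSS": rss,
        "MSR": msr, "MSE": mse,
        "SEE": math.sqrt(mse),
        "F": msr / mse,
        "R2": rss / sst,
    }

# Illustrative data (hypothetical, chosen only to keep the arithmetic small).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_simple_ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]
summary = anova_summary(y, y_hat, k=1)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For an OLS fit with an intercept, SST = RSS + SSE, which is a quick sanity check on the output.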

NOTE

R² never decreases when more independent variables are added, so maximizing it rewards model complexity and leads to overfitting. Adjusted R² corrects for this by penalizing complexity:

$$\bar{R}^2 = 1 - \left(\frac{n - 1}{n - k - 1}\right)\left(1 - R^2\right)$$
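As a small sketch of how the penalty works, the function below evaluates adjusted R² from R², the sample size n, and the number of independent variables k. The function name `adjusted_r2` and the numbers in the example are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Hypothetical comparison: adding a second variable nudges R2 up slightly,
# but the adjusted value falls because the gain does not offset the penalty.
base = adjusted_r2(0.60, n=5, k=1)
bigger = adjusted_r2(0.61, n=5, k=2)
print(f"k=1: {base:.4f}, k=2: {bigger:.4f}")
```

Unlike R², the adjusted value can decrease when a variable is added, and it can even be negative for a poor fit.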

The same idea underlies a stricter criterion, the Akaike information criterion (AIC):

$$\mathrm{AIC} = n \ln\left(\frac{\mathrm{SSE}}{n}\right) + 2(k + 1)$$

A lower AIC indicates a better model. The term $2(k + 1)$ is the penalty for complexity.

The Bayesian information criterion (BIC) adjusts the penalty and is stricter than AIC whenever $\ln n > 2$, that is, for $n \ge 8$:

$$\mathrm{BIC} = n \ln\left(\frac{\mathrm{SSE}}{n}\right) + \ln(n)\,(k + 1)$$
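To illustrate how the two criteria trade fit against complexity, here is a minimal sketch that evaluates both from SSE, n, and k, using the usual form in which the penalty is added to the fit term. The function names `aic` and `bic` and the two candidate models are hypothetical.

```python
import math

def aic(sse, n, k):
    """AIC = n * ln(SSE / n) + 2 * (k + 1)."""
    return n * math.log(sse / n) + 2 * (k + 1)

def bic(sse, n, k):
    """BIC = n * ln(SSE / n) + ln(n) * (k + 1)."""
    return n * math.log(sse / n) + math.log(n) * (k + 1)

# Hypothetical candidates fitted to the same n = 50 observations.
n = 50
small = {"k": 2, "sse": 120.0}  # fewer variables, slightly worse fit
large = {"k": 5, "sse": 115.0}  # more variables, slightly better fit

for label, m in (("small", small), ("large", large)):
    print(label,
          f"AIC={aic(m['sse'], n, m['k']):.3f}",
          f"BIC={bic(m['sse'], n, m['k']):.3f}")
```

With these numbers both criteria prefer the smaller model: the drop in SSE from three extra variables is too small to pay for the added penalty. The gap between BIC and AIC equals $(\ln n - 2)(k + 1)$, so BIC penalizes the larger model more heavily here.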

Hypothesis Testing

To test whether an individual independent variable is significant in explaining the dependent variable, we use the t-statistic:

$$H_0: b_i = 0, \quad H_1: b_i \neq 0$$

$$t = \frac{\hat{b}_i}{s_{\hat{b}_i}}$$
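As a concrete sketch of this ratio, the code below computes the slope estimate, its standard error, and the resulting t-statistic for a simple regression. It assumes k = 1 and uses the textbook simple-regression formula for the slope's standard error, the square root of MSE divided by the sum of squared deviations of x; the function name `slope_t_statistic` and the data are illustrative.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, standard error of slope, t) for a simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)               # n - k - 1 with k = 1
    se_slope = math.sqrt(mse / sxx)   # simple-regression special case
    return slope, se_slope, slope / se_slope

# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, se_slope, t_stat = slope_t_statistic(x, y)
print(f"slope={slope:.4f}, se={se_slope:.4f}, t={t_stat:.4f}")
```

The statistic is compared with a critical value from the t distribution with n - k - 1 degrees of freedom. With a single regressor, the square of this t-statistic equals the model's F-statistic.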

To account for correlations among the independent variables and assess the model as a whole, we use the F-test:

$$H_0: b_i = 0 \ \ \forall i, \quad H_1: \exists i,\ b_i \neq 0.$$

$$F = \frac{\mathrm{MSR}}{\mathrm{MSE}}$$
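It can help to see how this statistic relates to R². The short derivation below assumes an OLS fit with an intercept, so that SST = RSS + SSE, and divides numerator and denominator by SST:

$$
F = \frac{\mathrm{MSR}}{\mathrm{MSE}}
  = \frac{\mathrm{RSS}/k}{\mathrm{SSE}/(n - k - 1)}
  = \frac{(\mathrm{RSS}/\mathrm{SST})/k}{(\mathrm{SSE}/\mathrm{SST})/(n - k - 1)}
  = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}
$$

For instance, with the hypothetical values $R^2 = 0.6$, $n = 5$, and $k = 1$, this gives $F = 0.6 / (0.4 / 3) = 4.5$.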

NOTE

A restricted F-test checks whether a subset of the variables is jointly significant. The tested variables are dropped from the original (unrestricted) model to create a restricted model:

$$H_0: b_j = 0 \ \ \forall j \text{ in the tested subset}, \quad H_1: \exists j,\ b_j \neq 0.$$

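$$F = \frac{\left(\mathrm{SSE}_{\text{restricted}} - \mathrm{SSE}_{\text{unrestricted}}\right)/q}{\mathrm{SSE}_{\text{unrestricted}}/(n - k - 1)}$$

Here $q$ is the number of variables excluded from the restricted model and $k$ is the number of independent variables in the unrestricted model.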
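The restricted F-statistic needs only the two error sums of squares and the counts involved. Below is a minimal sketch; the function name `restricted_f` and the input values are hypothetical, with q taken as the number of excluded variables and k as the number of independent variables in the unrestricted model.

```python
def restricted_f(sse_restricted, sse_unrestricted, q, n, k):
    """F-statistic for jointly testing q variables dropped from a k-variable model."""
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator

# Hypothetical example: dropping q = 3 of k = 5 variables raises SSE
# from 115 to 120 on n = 50 observations.
f_subset = restricted_f(120.0, 115.0, q=3, n=50, k=5)
print(f"F = {f_subset:.4f}")

# Special case: if every variable is dropped, the restricted model is
# intercept-only, its SSE equals SST, and the statistic reduces to MSR / MSE.
f_overall = restricted_f(6.0, 2.4, q=1, n=5, k=1)
print(f"F = {f_overall:.4f}")
```

The result is compared with a critical value from the F distribution with q and n - k - 1 degrees of freedom. A small value, as in the first example, suggests the dropped variables add little explanatory power as a group.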