Model Misspecification

  • Omitted variables
  • Inappropriate form of variables
  • Inappropriate variable scaling
  • Inappropriate data pooling

Violations of Regression Assumptions

Heteroskedasticity

  • Unconditional heteroskedasticity: the variance of the error term is not correlated with the independent variables. It creates no major problems for statistical inference.
  • Conditional heteroskedasticity: the variance of the error term is correlated with the values of the independent variables.

In this case, the coefficient estimates b̂j are not affected, but the standard errors of the coefficients, sb̂j, are usually unreliable. With financial data, the standard errors are most likely **underestimated**, so the t-statistics are inflated and we tend to find significant relationships where none actually exist (Type I error).

To test for heteroskedasticity, we can use the Breusch-Pagan χ2 test:

H0:no heteroskedasticity.

BP χ2 = n × R2resid,  df = k

where n is the number of observations and R2resid is the R2 of a second regression, in which the squared residuals from the first regression are regressed on the independent variables:

ϵ̂i2 = a0 + a1x1i + a2x2i + … + ei

To correct for heteroskedasticity, we use robust standard errors to recalculate the t-statistics, or use generalized least squares to refit the model.

Serial Correlation

The error terms are correlated with one another, typically in time series. Positive serial correlation increases the chance that consecutive error terms share the same sign; negative serial correlation does the opposite.

  • Positive serial correlation: standard errors underestimated and t-statistics inflated
  • Negative serial correlation: standard errors overestimated and t-statistics deflated

To test for serial correlation, we use the Durbin-Watson test:

  • The statistic lies in (0, 4). A value near 0 indicates positive serial correlation, near 4 indicates negative serial correlation, and near 2 indicates no serial correlation.
  • It is limited to first-order serial correlation.

The Breusch-Godfrey test:

H0: ρ1 = 0,  H1: ρ1 ≠ 0.

ϵ̂t = a0 + a1x1t + a2x2t + … + akxkt + ρ1ϵ̂t−1 + et

It can be extended to higher lag orders p. The test uses an F-test with df = n − p − k − 1.

To correct for serial correlation, we use serial-correlation-consistent (Newey-West) standard errors or modify the regression itself.

Multicollinearity

It refers to cases where two or more independent variables are highly, but not perfectly, correlated with each other. The coefficient estimates become imprecise and unreliable.

  • Similar to negative serial correlation, standard errors are overestimated.

To test, we use the variance inflation factor (VIF):

VIFj = 1 / (1 − Rj2)

  • The smaller, the better
  • If VIF > 5, further investigation is warranted
  • If VIF > 10, multicollinearity is likely a serious problem

To correct:

  • Exclude variables
  • Increase sample size