Extensions of Multiple Regression

Influence Analysis

High leverage point

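High leverage point: an observation with an extreme value of an independent variable.

Leverage $h_{ii}$ measures the distance between the $i$-th observation's values of the independent variables and their means. It lies between 0 and 1; the higher the leverage, the more influence the observation can have on the estimated regression.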

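  • Rule of thumb: if the leverage exceeds $3\left(\frac{k+1}{n}\right)$, where $k$ is the number of independent variables and $n$ is the number of observations, the observation is potentially influential.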
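The rule of thumb above can be sketched in a few lines. This is a minimal illustration, assuming a regression with an intercept and a single regressor ($k = 1$), where leverage has the closed form $h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_j (x_j - \bar{x})^2}$. The function name and the data are hypothetical, not from the notes.

```python
def leverage_single_regressor(x):
    """Leverage h_ii for a regression with an intercept and one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


# Hypothetical data: the last observation has an extreme x value.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
k = 1                      # number of independent variables
n = len(x)                 # number of observations

h = leverage_single_regressor(x)
cutoff = 3 * (k + 1) / n   # rule-of-thumb threshold from the notes
flagged = [i for i, h_i in enumerate(h) if h_i > cutoff]
print(flagged)
```

With these numbers only the last observation (index 9) exceeds the cutoff. With several regressors the leverages would instead come from the diagonal of the hat matrix, which needs a linear algebra library.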

Outlier

Outlier: an observation with an extreme value of the dependent variable.

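Studentized residual $t_i$: compares the observed $y_i$ with the value predicted by the model re-estimated with observation $i$ deleted.

  • Compare $t_i$ with the critical value of a $t$-distribution with $df = n - k - 2$. If $|t_i|$ exceeds the critical value, the observation is a potentially influential outlier.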
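As a rough sketch of the studentized-residual check, the code below again assumes one regressor ($k = 1$) and uses the standard identity that the deleted model's sum of squared errors equals $SSE - e_i^2 / (1 - h_{ii})$, so the model does not have to be refitted $n$ times. The data are hypothetical, and the critical value 2.365 is the two-tailed 5% value for 7 degrees of freedom.

```python
import math


def studentized_residuals(x, y):
    """Studentized (deleted) residuals for y = b0 + b1*x, i.e. k = 1."""
    n, k = len(x), 1
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    t_values = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - mean_x) ** 2 / sxx
        # Residual variance of the model re-estimated without observation i.
        s2_deleted = (sse - e * e / (1 - h)) / (n - k - 2)
        t_values.append(e / math.sqrt(s2_deleted * (1 - h)))
    return t_values


# Hypothetical data: y is roughly 2x + 1, except for index 5.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 20.0, 15.1, 16.9, 19.0, 21.1]

t = studentized_residuals(x, y)
t_crit = 2.365  # two-tailed 5% critical value, df = n - k - 2 = 7
outliers = [i for i, t_i in enumerate(t) if abs(t_i) > t_crit]
print(outliers)
```

Only the observation at index 5 is flagged here, since deleting it leaves an almost perfect linear fit.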

Cook's Distance

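  • Measures how much the fitted values of the regression change when observation $i$ is deleted, combining the size of its residual with its leverage.
  • The higher the value, the more likely the observation is influential and the more problematic it is for the estimates.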
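One common way to compute Cook's distance is $D_i = \frac{e_i^2}{(k+1)\,MSE} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$. The sketch below uses that form for a single regressor; note that the divisor convention differs between texts (some use $k$ rather than $k+1$), and the threshold of 1 in the last line is a commonly quoted cutoff, not one stated in these notes. The data are hypothetical.

```python
def cooks_distance(x, y):
    """Cook's distance for y = b0 + b1*x (k = 1), dividing by k + 1."""
    n, k = len(x), 1
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - k - 1)
    distances = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append(e * e / ((k + 1) * mse) * h / (1 - h) ** 2)
    return distances


# Hypothetical data: the last point has extreme x and lies far off the line y = 2x + 1.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [3, 5, 7, 9, 11, 13, 15, 17, 19, 20]

d = cooks_distance(x, y)
most_influential = d.index(max(d))
print(most_influential, d[most_influential] > 1)
```

The last observation dominates because it combines high leverage with a residual that the fitted line could not fully absorb.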

Qualitative Variables

Dummy Variables

Dummy variables can be used to represent qualitative variables in regression analysis: each dummy takes the value 1 if the observation belongs to a given category and 0 otherwise.

NOTE

We use $n-1$ dummy variables to represent $n$ categories to avoid perfect multicollinearity with the intercept. The omitted category serves as the baseline.
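The $n-1$ encoding can be illustrated with a small helper. The function name `make_dummies` and the quarterly data are hypothetical; by default the first category in sorted order is dropped as the baseline.

```python
def make_dummies(values, baseline=None):
    """Encode n categories as n - 1 dummy columns, dropping the baseline."""
    categories = sorted(set(values))
    if baseline is None:
        baseline = categories[0]
    kept = [c for c in categories if c != baseline]
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical example: four quarters need only three dummies.
quarters = ["Q1", "Q2", "Q4", "Q3", "Q1"]
columns, rows = make_dummies(quarters)
print(columns)  # ['Q2', 'Q3', 'Q4']
for row in rows:
    print(row)
```

An observation in the baseline category (Q1 here) has all dummies equal to 0, so each dummy's coefficient is read as a difference relative to the baseline.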

Logit Model

Studying the odds for win or loss (1 or 0): $\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + \dots + \epsilon$ with $p = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + \dots + \epsilon)}}$.
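The two expressions above are inverses of each other, which a short sketch can confirm. The coefficients and inputs are hypothetical, and the error term is left out because the function returns a predicted probability.

```python
import math


def predicted_probability(b, x):
    """p = 1 / (1 + exp(-(b0 + b1*x1 + ...))), with b = [b0, b1, ...]."""
    z = b[0] + sum(b_j * x_j for b_j, x_j in zip(b[1:], x))
    return 1 / (1 + math.exp(-z))


def log_odds(p):
    """ln(p / (1 - p)), the left-hand side of the logit model."""
    return math.log(p / (1 - p))


# Hypothetical coefficients and one observation.
b = [-1.0, 0.5, 2.0]
x = [2.0, 0.25]          # linear index: -1 + 0.5*2 + 2*0.25 = 0.5

p = predicted_probability(b, x)
print(round(p, 4), round(log_odds(p), 4))
```

Taking the log odds of the predicted probability recovers the linear index, so each coefficient is the change in log odds per unit change in its variable.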

To fit the model, we use maximum likelihood estimation (MLE) and evaluate the fit with a pseudo-$R^2$. The log-likelihood can also be used to compare models: it is negative, and a higher value (closer to zero) indicates a better fit.
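As an illustration of these two measures, the sketch below computes the log-likelihood of a set of fitted probabilities and one common pseudo-$R^2$, McFadden's $1 - \frac{LL_{model}}{LL_{null}}$. The notes do not say which pseudo-$R^2$ is meant, so that choice is an assumption, and the fitted probabilities are hypothetical values rather than the output of an actual MLE fit.

```python
import math


def log_likelihood(y, p):
    """Sum of ln(p_i) for outcomes of 1 and ln(1 - p_i) for outcomes of 0."""
    return sum(math.log(p_i) if y_i == 1 else math.log(1 - p_i)
               for y_i, p_i in zip(y, p))


def mcfadden_pseudo_r2(y, p_fitted):
    """1 - LL_model / LL_null, where the null model predicts mean(y) for all."""
    p_bar = sum(y) / len(y)
    ll_null = log_likelihood(y, [p_bar] * len(y))
    ll_model = log_likelihood(y, p_fitted)
    return 1 - ll_model / ll_null


# Hypothetical outcomes and fitted probabilities.
y = [1, 1, 1, 0, 0, 0]
p_fitted = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]

ll_model = log_likelihood(y, p_fitted)
ll_null = log_likelihood(y, [0.5] * len(y))
r2 = mcfadden_pseudo_r2(y, p_fitted)
print(round(ll_null, 3), round(ll_model, 3), round(r2, 3))
```

The fitted model's log-likelihood is closer to zero than the intercept-only model's, which is what pushes this pseudo-$R^2$ above 0.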

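Similar to the joint F-test in linear regression, the likelihood ratio (LR) test checks whether a group of variables is jointly significant. A larger LR statistic is stronger evidence that the variables are jointly significant (reject $H_0$ that their coefficients are all zero).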
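A minimal sketch of the LR test, assuming the usual form $LR = -2\,(LL_{restricted} - LL_{unrestricted})$ compared with a chi-square critical value whose degrees of freedom equal the number of restrictions $q$. The log-likelihood values are hypothetical, and the small table holds standard 5% chi-square critical values.

```python
# Standard 5% critical values of the chi-square distribution, keyed by df.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def likelihood_ratio_stat(ll_restricted, ll_unrestricted):
    """LR = -2 * (LL_restricted - LL_unrestricted); never negative for nested models."""
    return -2.0 * (ll_restricted - ll_unrestricted)


def reject_h0(lr, q):
    """True if LR exceeds the 5% chi-square critical value with q restrictions."""
    return lr > CHI2_CRIT_5PCT[q]


# Hypothetical log-likelihoods of a restricted and an unrestricted logit model.
ll_restricted = -4.16
ll_unrestricted = -1.81

lr = likelihood_ratio_stat(ll_restricted, ll_unrestricted)
print(round(lr, 2), reject_h0(lr, q=1), reject_h0(lr, q=2))
```

With one excluded variable the statistic of about 4.70 exceeds 3.841, so $H_0$ is rejected. If the same improvement came from dropping two variables, it would fall short of 5.991.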