BYU

Richard Butler

VIF and Correlation Example

sample 1 33
read sales income pop c
107919 13240 65044 3
118866 22554 101376 5
98579 16916 124989 7
122015 20967 55249 2
152827 19576 73775 3
91259 15039 48484 5
123550 21857 138809 8
160931 26435 50244 2
98496 24024 104300 6
108052 14987 37852 2
144788 30902 66921 3
164571 31573 166332 4
105564 19001 61951 3
102568 20058 100441 5
103342 16194 39462 2
127030 21384 139900 5
166755 18800 171740 6
125343 15289 149894 6
121886 16702 57386 3
134594 19093 185105 6
152937 26502 114520 3
109622 18760 52933 3
149884 33242 203500 5
98388 14988 39334 4
140791 18505 95120 3
101260 16839 49200 3
139517 28915 113566 4
115236 19033 194125 9
136749 19200 233844 7
105067 22833 83416 7
136872 14409 183953 6
117146 20307 60457 3
163538 20111 65065 2
* next I add an option that computes correlation between the variables
* rough rule of thumb: a pairwise correlation above .7 in absolute value
* may signal multicollinearity
* this is the pcor option in stat
* another test for collinearity is the variance inflation factor VIF
* where values above 5, and certainly above 20ish, indicate multicoll...
* note below that the correlations indicate a potential c-income problem,
* but the VIFs are fine, indicating no problem at all: go with the VIFs
stat income pop c /pcor cor=s
matrix vif=diag(inv(s))
print vif
* alternative computation of the VIF for the first variable (income):
* regress it on the other regressors, then VIF = 1/(1 - R-square)
ols income pop c
gen1 vif_inc=1/(1-$R2)
print vif_inc
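* the lines down to print vif_c are an added sketch, not part of the
* original example: the same auxiliary-regression VIF for pop and c,
* assuming $R2 holds the R-square of the most recent ols command
* vif_pop and vif_c are names introduced here for illustration; their
* values should match the second and third elements of vif printed above
ols pop income c
gen1 vif_pop=1/(1-$R2)
print vif_pop
ols c income pop
gen1 vif_c=1/(1-$R2)
print vif_c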
* original ols model follows
ols sales income pop c
end
stop