Data are from McDonald and Schwing (1973), "Instabilities of Regression Estimates Relating Air Pollution to Mortality," Technometrics, 15, 463-481. The data set contains 15 independent variables (see list below) and a measure of mortality for 60 US metropolitan areas in 1959-1961. It was used to illustrate ridge regression because the full X matrix is badly ill-conditioned (it has a huge condition number); for example, x12 and x13 have a correlation of 0.98 in the correlation table below.
An interesting feature is that the forward addition sequence differs somewhat from the backward elimination sequence. Here are the least squares coefficients for the 5-variable forward and backward sequence models (an estimate of 0 means the variable was not selected), together with the LASSO estimates with the penalty chosen by 5-fold cross-validation. The forward-selected model includes variables x1, x2, x6, x9, and x14, whereas the backward sequence replaces x1 and x14 with x12 and x13. The LASSO solution is essentially a shrunken version of the forward model, with two additional small coefficients on x7 and x8.
           FS(5)      BE(5)      LASSO
int.    1016.433   1145.200    997.584
x1         1.488          0      1.192
x2        -1.623     -1.563     -0.897
x3             0          0          0
x4             0          0          0
x5             0          0          0
x6       -12.764    -19.370    -11.719
x7             0          0     -0.027
x8             0          0      0.002
x9         4.066      4.461      3.396
x10            0          0          0
x11            0          0          0
x12            0     -0.984          0
x13            0      1.992          0
x14        0.284          0      0.217
x15            0          0          0
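
The comparison described above can be checked directly from the printed coefficients. The following is a minimal sketch in Python, not part of the original analysis; the list and function names (fs, be, lasso, support) are made up for this illustration, and the numbers are copied from the table with the intercept left out.

```python
# Coefficients copied from the table above (intercept omitted; 0 = not selected).
# All names here are illustrative, not from the original source.
names = [f"x{i}" for i in range(1, 16)]
fs    = [1.488, -1.623, 0, 0, 0, -12.764, 0, 0, 4.066, 0, 0, 0, 0, 0.284, 0]
be    = [0, -1.563, 0, 0, 0, -19.370, 0, 0, 4.461, 0, 0, -0.984, 1.992, 0, 0]
lasso = [1.192, -0.897, 0, 0, 0, -11.719, -0.027, 0.002, 3.396, 0, 0, 0, 0, 0.217, 0]

def support(coefs):
    """Return the set of variables with a nonzero coefficient."""
    return {n for n, c in zip(names, coefs) if c != 0}

fs_set, be_set, lasso_set = support(fs), support(be), support(lasso)

print("in FS(5) only:", sorted(fs_set - be_set))
print("in BE(5) only:", sorted(be_set - fs_set))
print("LASSO extras: ", sorted(lasso_set - fs_set))

# Ratio of each LASSO coefficient to the forward-selection least squares
# coefficient; a value between 0 and 1 means shrinkage toward zero.
ratios = {n: l / f for n, f, l in zip(names, fs, lasso) if f != 0}
for n, r in ratios.items():
    print(f"{n}: LASSO/FS = {r:.2f}")
```

Every ratio lies strictly between 0 and 1 (roughly 0.55 for x2 up to about 0.92 for x6), which is what "a shrunken version of the forward model" means here.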
Description of Variables
Y    Total Age Adjusted Mortality Rate
x1   Mean annual precipitation in inches
x2   Mean January temperature in degrees Fahrenheit
x3   Mean July temperature in degrees Fahrenheit
x4   Percent of 1960 SMSA population that is 65 years of age or over
x5   Population per household, 1960 SMSA
x6   Median school years completed for those over 25 in 1960 SMSA
x7   Percent of housing units that are sound and with all facilities
x8   Population per square mile in urbanized area in 1960
x9   Percent of 1960 urbanized area population that is non-white
x10  Percent employment in white-collar occupations in 1960 urbanized area
x11  Percent of families with income under $3,000 in 1960 urbanized area
x12  Relative pollution potential of hydrocarbons, HC
x13  Relative pollution potential of oxides of nitrogen, NOx
x14  Relative pollution potential of sulfur dioxide, SO2
x15  Percent relative humidity, annual average at 1 p.m.

A few descriptive statistics:
The MEANS Procedure
Variable N Mean Std Dev Minimum Maximum
------------------------------------------------------------------------------
x1 60 37.3666667 9.9846775 10.0000000 60.0000000
x2 60 33.9833333 10.1688985 12.0000000 67.0000000
x3 60 74.5833333 4.7631768 63.0000000 85.0000000
x4 60 8.7983333 1.4645520 5.6000000 11.8000000
x5 60 3.2631667 0.1352523 2.9200000 3.5300000
x6 60 10.9733333 0.8452994 9.0000000 12.3000000
x7 60 80.9133333 5.1413731 66.8000000 90.7000000
x8 60 3876.05 1454.10 1441.00 9699.00
x9 60 11.8700000 8.9211480 0.8000000 38.5000000
x10 60 46.0816667 4.6130431 33.8000000 59.7000000
x11 60 14.3733333 4.1600956 9.4000000 26.4000000
x12 60 37.8500000 91.9776732 1.0000000 648.0000000
x13 60 22.6500000 46.3332896 1.0000000 319.0000000
x14 60 53.7666667 63.3904678 1.0000000 278.0000000
x15 60 57.6666667 5.3699309 38.0000000 73.0000000
y 60 940.3585000 62.2066852 790.7300000 1113.16
------------------------------------------------------------------------------
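
The means above give a quick consistency check on the coefficient table: a least squares fit with an intercept passes through the point of means, so each fitted model evaluated at the predictor means should reproduce the mean of y up to rounding. The sketch below is in Python and is not from the original source; the names mean, models, and predict_at_means are hypothetical, and for the LASSO column the check assumes the intercept was left unpenalized, which the source does not state.

```python
# Means copied from the MEANS output above; only variables that appear in
# at least one of the three models are needed. Names are illustrative.
mean = {"x1": 37.3666667, "x2": 33.9833333, "x6": 10.9733333,
        "x7": 80.9133333, "x8": 3876.05, "x9": 11.87,
        "x12": 37.85, "x13": 22.65, "x14": 53.7666667}
y_mean = 940.3585

# (intercept, nonzero coefficients) for each column of the coefficient table.
models = {
    "FS(5)": (1016.433, {"x1": 1.488, "x2": -1.623, "x6": -12.764,
                         "x9": 4.066, "x14": 0.284}),
    "BE(5)": (1145.200, {"x2": -1.563, "x6": -19.370, "x9": 4.461,
                         "x12": -0.984, "x13": 1.992}),
    "LASSO": (997.584, {"x1": 1.192, "x2": -0.897, "x6": -11.719,
                        "x7": -0.027, "x8": 0.002, "x9": 3.396,
                        "x14": 0.217}),
}

def predict_at_means(intercept, coefs):
    """Evaluate the linear model at the vector of predictor means."""
    return intercept + sum(b * mean[v] for v, b in coefs.items())

gaps = {name: predict_at_means(b0, coefs) - y_mean
        for name, (b0, coefs) in models.items()}
for name, gap in gaps.items():
    print(f"{name}: fitted value at the means minus mean(y) = {gap:+.3f}")
```

The two least squares models land within a few hundredths of the mean of y. The LASSO gap is larger, a few tenths, which is consistent with rounding alone: the x8 coefficient is printed as 0.002 and multiplies a mean of about 3876.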
Correlations:
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 y
x1 1.00 0.09 0.50 0.10 0.26 -0.49 -0.49 0.00 0.41 -0.30 0.51 -0.53 -0.49 -0.11 -0.08 0.51
x2 0.09 1.00 0.35 -0.40 -0.21 0.12 0.01 -0.10 0.45 0.24 0.57 0.35 0.32 -0.11 0.07 -0.03
x3 0.50 0.35 1.00 -0.43 0.26 -0.24 -0.42 -0.06 0.58 -0.02 0.62 -0.36 -0.34 -0.10 -0.45 0.28
x4 0.10 -0.40 -0.43 1.00 -0.51 -0.14 0.07 0.16 -0.64 -0.12 -0.31 -0.02 0.00 0.02 0.11 -0.17
x5 0.26 -0.21 0.26 -0.51 1.00 -0.40 -0.41 -0.18 0.42 -0.43 0.26 -0.39 -0.36 0.00 -0.14 0.36
x6 -0.49 0.12 -0.24 -0.14 -0.40 1.00 0.55 -0.24 -0.21 0.70 -0.40 0.29 0.22 -0.23 0.18 -0.51
x7 -0.49 0.01 -0.42 0.07 -0.41 0.55 1.00 0.18 -0.41 0.34 -0.68 0.39 0.35 0.12 0.12 -0.43
x8 0.00 -0.10 -0.06 0.16 -0.18 -0.24 0.18 1.00 -0.01 -0.03 -0.16 0.12 0.17 0.43 -0.12 0.27
x9 0.41 0.45 0.58 -0.64 0.42 -0.21 -0.41 -0.01 1.00 0.00 0.70 -0.03 0.02 0.16 -0.12 0.64
x10 -0.30 0.24 -0.02 -0.12 -0.43 0.70 0.34 -0.03 0.00 1.00 -0.19 0.20 0.16 -0.07 0.06 -0.28
x11 0.51 0.57 0.62 -0.31 0.26 -0.40 -0.68 -0.16 0.70 -0.19 1.00 -0.13 -0.10 -0.10 -0.15 0.41
x12 -0.53 0.35 -0.36 -0.02 -0.39 0.29 0.39 0.12 -0.03 0.20 -0.13 1.00 0.98 0.28 -0.02 -0.18
x13 -0.49 0.32 -0.34 0.00 -0.36 0.22 0.35 0.17 0.02 0.16 -0.10 0.98 1.00 0.41 -0.05 -0.08
x14 -0.11 -0.11 -0.10 0.02 0.00 -0.23 0.12 0.43 0.16 -0.07 -0.10 0.28 0.41 1.00 -0.10 0.43
x15 -0.08 0.07 -0.45 0.11 -0.14 0.18 0.12 -0.12 -0.12 0.06 -0.15 -0.02 -0.05 -0.10 1.00 -0.09
y 0.51 -0.03 0.28 -0.17 0.36 -0.51 -0.43 0.27 0.64 -0.28 0.41 -0.18 -0.08 0.43 -0.09 1.00
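
The 0.98 correlation between x12 and x13 already shows why the regression is unstable. As a rough illustration (a sketch under simplifying assumptions, not a computation from the original paper), the Python snippet below considers only that pair of standardized predictors and computes the condition number of their 2x2 correlation matrix and the corresponding variance inflation factor. The variable names are made up for this example.

```python
r = 0.98  # correlation between x12 and x13, read from the table above

# The 2x2 correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r,
# so its condition number is their ratio.
eig_max, eig_min = 1 + r, 1 - r
cond = eig_max / eig_min

# Variance inflation factor if x12 and x13 were the only two predictors.
vif = 1 / (1 - r ** 2)

print(f"condition number = {cond:.1f}, VIF = {vif:.1f}")
```

This gives a condition number of about 99 and a variance inflation factor of about 25 for the pair alone. By eigenvalue interlacing, the condition number of the full 15-variable correlation matrix can only be at least as large as that of this 2x2 block.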