Diabetes Data

SAS code to access the data using the original data set from Trevor Hastie's LARS software page.

Proc Means and Proc Print Output when using the above data.

The data from the R package lars. SAS code to access these data. Proc Means and Proc Print Output when using the above data from R. Note that, in the R version, the 10 x variables have been standardized so that each has mean 0 and squared length 1 (sum(x^2) = 1).
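To make the standardization concrete, here is a minimal Python sketch of one way to bring a column to mean 0 and squared length 1: subtract the mean, then divide by the Euclidean norm of the centered values. The function name standardize and the sample values are hypothetical and used only for illustration. Whether the lars package computes it in exactly this way is an assumption; the note above states only the resulting properties.

```python
import math


def standardize(column):
    """Center a column to mean 0, then scale it so that sum(x^2) == 1."""
    n = len(column)
    mean = sum(column) / n
    centered = [v - mean for v in column]
    # Euclidean norm of the centered column
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0:
        raise ValueError("a constant column cannot be scaled to unit length")
    return [v / norm for v in centered]


# Made-up values for illustration only; these are NOT rows of the diabetes data.
example = [59.0, 48.0, 72.0, 24.0, 50.0]
scaled = standardize(example)

column_mean = sum(scaled) / len(scaled)      # approximately 0
squared_length = sum(v * v for v in scaled)  # approximately 1
```

Note that this scaling differs from the more common z-score: the divisor is the norm of the centered column rather than its standard deviation, so the standardized values shrink as the number of observations grows.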

Brief Description

From Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani (2004), "Least Angle Regression" (with discussion), Annals of Statistics 32(2), 407-499:

"Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after baseline."

In the tab-delimited file above, the variables are named

AGE SEX BMI BP S1 S2 S3 S4 S5 S6 Y

whereas, in the R file, they are named

age sex bmi map tc ldl hdl tch ltg glu y
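
When moving between the two versions of the data, it can help to have the two naming schemes paired up explicitly. The Python sketch below assumes the two lists correspond position by position (AGE to age, BP to map, S1 to tc, and so on). The listing order suggests this but the page does not state it, so treat the pairing as an assumption. The names TAB_NAMES, R_NAMES, TAB_TO_R and rename_columns are hypothetical.

```python
# Variable names as listed for the tab-delimited file and for the R file.
TAB_NAMES = "AGE SEX BMI BP S1 S2 S3 S4 S5 S6 Y".split()
R_NAMES = "age sex bmi map tc ldl hdl tch ltg glu y".split()

# Assumption: the two lists line up position by position.
TAB_TO_R = dict(zip(TAB_NAMES, R_NAMES))


def rename_columns(header):
    """Translate tab-delimited column names to the R names, leaving unknown names as-is."""
    return [TAB_TO_R.get(name, name) for name in header]


renamed = rename_columns(["AGE", "BP", "S1", "Y"])
```

Under that assumption, rename_columns(["AGE", "BP", "S1", "Y"]) gives ["age", "map", "tc", "y"].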