fsr_linear is a SAS macro that
0. Is based on proc glmselect - if you don't have it, first go to
http://support.sas.com:80/rnd/app/da/glmselect.html
1. Standardizes all predictors to have mean 0 and variance 1 and
renames them as x1-xp, numbered as they occur in the data set.
2. Fits main effects, interactions, and squared terms according to a
modified forward selection procedure that on average has False
Selection Rate (FSR) = gamma (default=0.05)
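As a rough illustration of step 1, the standardization and renaming could be done by hand along the following lines. This is only a sketch, not the macro's own code: the data set names diabetes and std are assumptions, and the predictors are assumed to occur in the data set in the order listed.

```sas
/* Sketch only: data set names (diabetes, std) are hypothetical. */

/* Center each predictor to mean 0 and scale it to standard deviation 1 */
/* (equivalently, variance 1).                                          */
proc standard data=diabetes mean=0 std=1 out=std;
   var age sex bmi bp s1-s6;
run;

/* Rename the standardized predictors x1-x10, in data set order. */
data std;
   set std;
   rename age=x1 sex=x2 bmi=x3 bp=x4
          s1=x5 s2=x6 s3=x7 s4=x8 s5=x9 s6=x10;
run;
```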
Developed by Hugh Crews, August 2008.
Typical call (all linear and quadratic terms):
%fsr_linear(dataset=diabetes,model=age|sex|bmi|bp|s1|s2|s3|s4|s5|s6 @q,
gamma=0.05,y=y,method=5,terms=20,include=0,cbound=2);
method=1 Fast FSR with Strong Hierarchy
method=2 Fast FSR with No Hierarchy
method=3 Fast FSR with Weak Hierarchy
method=4 Fast FSR with No Hierarchy and Iterated Adjustment
method=5 Fast FSR with No Hierarchy and Sequential Adjustment
method=6 Fast FSR with Weak Hierarchy and Sequential Adjustment
method=7 Fast FSR with Main Effects only
The defaults are method=1 and gamma=0.05. Terms=x restricts the forward
sequence to x terms; the default is the full sequence (terms=full).
Include=k forces the first k terms into the model; the default is to
force in none (include=0). Cbound=b bounds the adjustment c used in
Methods 5 and 6 by b*(k_T-p+1)/(p+1); the default is to enforce no
limit on c.
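The following calls sketch how these options combine. They use only the parameters described above and the diabetes data from the typical call. Leaving parameters out so that they take their defaults is an assumption about how the macro is defined.

```sas
/* Strong hierarchy with the stated defaults (method=1, gamma=0.05), */
/* full forward sequence, no forced terms.                           */
%fsr_linear(dataset=diabetes, model=age--s6 @q, y=y);

/* Weak hierarchy with sequential adjustment (method=6), forward     */
/* sequence limited to 20 terms, adjustment c bounded using b=2.     */
%fsr_linear(dataset=diabetes, model=age--s6 @q, y=y,
            method=6, gamma=0.05, terms=20, cbound=2);
```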
The model statement is meant to be close to general SAS usage,
except for @q, which tells the program to add squared terms for
the part of the model specification immediately preceding it.
Examples:
model=age|sex|bmi|bp|s1|s2|s3|s4|s5|s6 @q
    all 1st & 2nd order terms
model=age--s6 @q
    same as above
model=age--s6
    only linear
model=age|sex|bmi|bp|s1|s2|s3|s4|s5|s6
    linear and interactions only
model=age sex bmi--s6 @q plus include=2
    includes age and sex, then selects from full quadratic in the others
model=age sex age--s6 @q plus include=2
    same as above, but now age and sex interactions and age^2 are possible;
    redundancies like age appearing twice are no problem
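Written out as full calls, the two include=2 examples might look like this. The dataset= and y= values are borrowed from the typical call, and leaving method= and gamma= at their defaults is an assumption.

```sas
/* Force age and sex in, then select from the full quadratic in bmi--s6. */
%fsr_linear(dataset=diabetes, model=age sex bmi--s6 @q, y=y, include=2);

/* Same forced terms, but age and sex interactions and age^2 are now */
/* candidates as well; age appearing twice is harmless.              */
%fsr_linear(dataset=diabetes, model=age sex age--s6 @q, y=y, include=2);
```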
We recommend method=1, which enforces the strong hierarchy principle:
main effects must enter before interactions. A main benefit of this approach is
that the models chosen are invariant to centering and rescaling. We center and rescale
each variable before running forward selection in order to keep correlations between
main effects and second order terms as low as possible.
If one wants to search for interactions without requiring main effects to enter first, there are three versions of a "no hierarchy" approach, but we recommend the method=5 version because it adjusts for large numbers of interactions and is computationally fast. Method=6 is a compromise between the method=1 strong hierarchy and the no hierarchy approaches. It requires only one main effect to enter before an interaction involving that effect enters. Finally, method=7 is a main effects only approach.
Note that we do not have any special way to handle categorical variables. However, we include a dummy creator macro to create dummy 0-1 variables for an arbitrary number of categorical variables.
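For readers who want to see what 0-1 dummy coding amounts to, here is a minimal hand-written sketch. It is not the dummy creator macro itself: the data set credit, the character variable housing, and its levels 'own', 'rent', and 'free' are all hypothetical.

```sas
/* Sketch only: data set and variable names are hypothetical. */
data credit_dummies;
   set credit;
   /* A SAS comparison evaluates to 1 when true and 0 when false. */
   housing_own  = (housing = 'own');
   housing_rent = (housing = 'rent');
   /* 'free' is the reference level: both dummies are 0 for it. */
run;
```

A variable with m levels needs m-1 dummies; the omitted level serves as the reference.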
The macro requires proc glmselect, which should be downloaded and installed if it is not already available.
The SAS macro for forward selection with example call and output for the diabetes data. Summary of diabetes data runs.
The SAS macro for forward selection with example call and output for the lucency data. Summary of lucency data runs.
Our logistic regression macro cannot handle data in the events/trials format, but we provide an expansion macro to create a data set with one row for each 0-1 y. Example call and output of the original and expanded data sets.
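The idea behind such an expansion can be sketched in a short data step. This is an illustration under assumed names, not the expansion macro provided here: the input data set grouped and its count variables events and trials are hypothetical.

```sas
/* Sketch only: grouped, events, and trials are hypothetical names. */
data expanded;
   set grouped;
   /* Write one output row per trial in the group. */
   do i = 1 to trials;
      y = (i <= events);   /* first "events" rows get y=1, the rest y=0 */
      output;
   end;
   drop i events trials;
run;
```

Each input row with, say, 3 events out of 5 trials becomes five rows that share the same predictor values, three with y=1 and two with y=0.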
The German credit example illustrates use of the dummy creator macro before using the fsr_logistic macro, with example call and output.
Proportional Hazards (Cox) Regression
The SAS macro for forward selection with example call and output for the AIDS Clinical Trial Group Protocol 175 (ACTG 175) data. Summary of ACTG 175 runs.
Sample code and output for the Primary Biliary Cirrhosis (PBC) Data. Summary of all runs.
A version of the SAS macro for forward selection based on SAS version 9.2 (PHREG changed between versions 9.1 and 9.2).