Using categorical predictors in a multiple regression.
In 1992 New Jersey's minimum wage increased from $4.25 to $5.05 per hour. The
fastfood data set contains information on 410 fast-food restaurants in New Jersey and
eastern Pennsylvania. The objective of this study is to determine the effect of the minimum wage
increase on employment and food prices. For each restaurant we use the change in entree price
from before to after the wage increase (CHANGEPRICE), the state (STATE: 0 = PA, no change in
minimum wage; 1 = NJ, increased minimum wage), and the chain (CHAIN: 1 = Burger King; 2 = KFC;
3 = Roy Rogers; 4 = Wendy's). More information about the data can be found here.
The SAS code below performs several regressions to investigate the relationships between change in price, state, and chain.
data fastfood;
set fastfood;
*create the outcome variable;
CHANGEPRICE = PENTREE2-PENTREE;
*create dummy variables for CHAIN with Burger King (CHAIN = 1) as the reference level;
KFC = 0; if CHAIN = 2 THEN KFC = 1;
ROYR = 0; if CHAIN = 3 THEN ROYR = 1;
WENDY = 0; if CHAIN = 4 THEN WENDY = 1;
*create interaction variables;
KFCSTATE = KFC*STATE;
ROYRSTATE = ROYR*STATE;
WENDYSTATE = WENDY*STATE;
run;
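*optional check, not part of the original analysis. It cross-tabulates CHAIN;
*against the new dummy variables to confirm the coding. The MISSING option;
*also lists any restaurant with a missing CHAIN, which the data step above;
*would code as 0 on all three dummies, the same as Burger King;
proc freq data = fastfood;
title "Check of dummy variable coding for CHAIN";
tables CHAIN*KFC*ROYR*WENDY / list missing;
run;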
proc reg data = fastfood;
title "One factor model - STATE";
model CHANGEPRICE = STATE;
run;
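*the pooled t-test below gives the same p-value as the test of the STATE;
*slope in the one factor STATE regression above;
proc ttest data = fastfood;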
title "t-test for STATE";
var CHANGEPRICE;
class STATE;
run;
proc reg data = fastfood;
title "One factor model - CHAIN";
model CHANGEPRICE = KFC ROYR WENDY;
run;
proc reg data = fastfood;
title "Two factor model - STATE, CHAIN";
model CHANGEPRICE = STATE KFC ROYR WENDY;
CHAIN: test KFC, ROYR, WENDY;
run;
proc reg data = fastfood;
title "Two factor model with interactions - STATE x CHAIN";
model CHANGEPRICE = STATE KFC ROYR WENDY
KFCSTATE ROYRSTATE WENDYSTATE;
INTERACTIONS: test KFCSTATE, ROYRSTATE, WENDYSTATE;
run;
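*a possible alternative, shown only as a sketch. PROC GLM can build the;
*dummy and interaction variables itself through the CLASS statement.;
*GLM takes the last level of each CLASS variable as the reference, which;
*here is CHAIN = 4 and STATE = 1, so the parameter estimates will differ;
*from the PROC REG output above. The Type III F test for STATE*CHAIN;
*should agree with the INTERACTIONS test, assuming no restaurant has a;
*missing CHAIN and every STATE by CHAIN cell contains data;
proc glm data = fastfood;
title "Two factor model with interactions using CLASS variables";
class STATE CHAIN;
model CHANGEPRICE = STATE CHAIN STATE*CHAIN / solution ss3;
run;
quit;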
**************************************************************************;
*********************** VARIABLE SELECTION CODE **********************;
**************************************************************************;
* You can also perform variable selection in the SAS Analyst window.;
* Click statistics->regression->linear->model and then select the procedure;
* you would like to use.;
proc reg data = fastfood;
title "All-subsets variable selection using adj R2";
model CHANGEPRICE = WENDY KFC ROYR CO_OWNED STATE EMPFT EMPPT
NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF HRSOPEN PENTREE NREGS
/ selection=ADJRSQ;
run;
proc reg data = fastfood;
title "All-subsets variable selection using Cp";
model CHANGEPRICE = WENDY KFC ROYR CO_OWNED STATE EMPFT EMPPT
NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF HRSOPEN PENTREE NREGS
/ selection=CP;
run;
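*a possible refinement, not in the original code. The BEST= option sets;
*how many models the all-subsets methods display, here the 10 models with;
*the highest adjusted R2. The value 10 is an arbitrary choice. ROYR is the;
*Roy Rogers dummy variable created in the data step;
proc reg data = fastfood;
title "All-subsets variable selection using adj R2, best 10 models";
model CHANGEPRICE = WENDY KFC ROYR CO_OWNED STATE EMPFT EMPPT
NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF HRSOPEN PENTREE NREGS
/ selection=ADJRSQ best=10;
run;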
proc reg data = fastfood;
title "Forward selection using alpha = 0.2";
model CHANGEPRICE = WENDY KFC ROYR CO_OWNED STATE EMPFT EMPPT
NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF HRSOPEN PENTREE NREGS
/ selection=forward sle=0.2;
run;
proc reg data = fastfood;
title "Stepwise selection using alpha = 0.2 for entry and alpha = 0.05 for exit";
model CHANGEPRICE = WENDY KFC ROYR CO_OWNED STATE EMPFT EMPPT
NMGRS WAGE_ST INCTIME FIRSTINC BONUS PCTAFF HRSOPEN PENTREE NREGS
/ selection=stepwise sle=0.2 sls=0.05;
run;