Homework 3
Here I have assembled the North Carolina Retail Sales data for you up to May of 2007.
The data can be found on our homepage under datasets.
Look for the data set ncretail_all.sas7bdat. Alternatively I have stored it as a text file.
(1) Just to make the report complete, we start with a graph as in assignment #1.
Plot the data in the Time Series Forecasting System or, if you are interested in code,
plot it with PROC GPLOT. If you like, you can make the label on the vertical
axis parallel to that axis. To do so, use an axis statement:
axis1 label=(angle=90 font=centb h=1.5) offset=(3,3);
proc gplot; plot ____ /vaxis=axis1;
symbol1 v=circle c=black i=join ci=blue;
Feel free to add any other graphics options you have learned to make the graph nicer.
(2) Does anything look a bit strange to you? Look at this NC Department of Revenue web page to see
if there is something we should add to the model:
http://www.dornc.com/publications/monthlysales.html
In particular look at the July 2005 report there or the bold "Important: Notice of Change"
and mention what you find.
(3) Suppose I have a level shift (like the 9/11/01 effect in our airline data) that happened on some date.
I can model this by creating a special predictor variable as follows:
X = (date >="11sep2001"d);
Changing the date from "11sep2001"d based on the web page information, add the appropriate X variable to
the data and use PROC PRINT to see what the data look like near the change point (the one you discovered
when you looked at the revenue department web page). Your code could be something like this:
PROC PRINT DATA=_____; WHERE "_____"D < DATE < "_____"D ;
where you would use date constants in the blanks to pick out a few points on either side of the shift
date.
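As a minimal sketch of how the X variable might be added, assuming the data set has been copied to the WORK library and its date variable is named DATE. The output data set name work.retail_shift is hypothetical, and the date shown is only the airline example from above, not the shift date for the retail data.

```sas
/* Sketch only: work.retail_shift is a hypothetical name.          */
data work.retail_shift;
   set ncretail_all;                /* assumes the data set is in WORK       */
   /* A comparison in SAS evaluates to 1 when true and 0 when false, */
   /* so X is 0 before the date and 1 on or after it.                */
   X = (date >= "11sep2001"d);      /* airline example date; replace it      */
run;
```

Running PROC PRINT on the new data set with a WHERE clause around the shift date should then show X switching from 0 to 1.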
(4) If I do this kind of shift analysis in a simple linear trend case (with a shift), my MODEL
statement in PROC REG will be something like
PROC REG; MODEL Y = DATE X;
Now I will get a prediction P for Y, say for example P = 10 + .01 DATE - 80 X.
Before the shift date, my predictions will lie on the trend line P= 10 + .01 DATE. After the shift they
lie on another line. What are the intercept _____ and slope ______ of this new line for this
explanatory example (P= 10 + .01 DATE-80 X)?
(5) Now using the idea in (4), the X variable you created in (3), and the full data set as provided in the
link in (1), fit this kind of model to the retail sales data and fill in the estimates you get from the
actual retail sales data:
P = _____ + ____ DATE + _____ X
Overlay the plot of predicted values on the plot of the original data. Using this model, what would you
guess NC retail sales were in January 1960? Explain why that is the easiest date for which to
estimate sales.
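One possible way to fit the model and build the overlay in code is sketched below. It assumes the X variable from part (3) is already on a data set, here given the hypothetical name work.retail_shift, and that the response variable is named SALES, which is also an assumption. Check the actual variable names with PROC CONTENTS first.

```sas
/* Sketch only: work.retail_shift, work.pred and SALES are hypothetical names. */
proc reg data=work.retail_shift;
   model sales = date X;            /* linear trend plus level shift           */
   output out=work.pred p=P;        /* P holds the predicted values            */
run;
quit;

symbol1 v=circle c=black i=none;    /* original data drawn as points           */
symbol2 v=none   c=blue  i=join;    /* predictions drawn as a joined line      */

proc gplot data=work.pred;
   plot sales*date=1 P*date=2 / overlay;   /* both series on one set of axes   */
run;
quit;
```

The OUTPUT statement writes the predictions next to the original observations, which is what lets a single PLOT statement with the OVERLAY option draw both series.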
(6) If the coefficient on X were 0, there would be no shift. For the real data, what is the t statistic
for testing the hypothesis that there was no real shift at that date? Recall that Pr>|t| is called the
"P-value" and represents the probability of getting a t at least as extreme as yours if in fact there is no real shift.
What is the P-value for your t statistic (the one associated with the shift)? If this probability is less
than some cutoff (we often use 0.05) then it is unlikely that the true process satisfies your null
hypothesis (no real shift) and we reject that hypothesis. Using all this information, write a one- or two-sentence
executive summary on whether or not the event you found on the web is associated with a permanent
shift in the data. (Also, just as a note, we could adjust the last part of the data by that shift coefficient
to estimate what would have happened without this "intervention".)
(7) (optional) In our previous looks at the earlier part of the data, we used seasonal
dummy variables to capture the seasonal effects. Go back and create those 11 dummy variables D1 through
D11 and add them to the model.
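A minimal sketch of one way to build the 11 dummies, assuming the data are monthly, the date variable is named DATE, and December is left out as the baseline month (that choice of baseline is an assumption; any one month can be omitted). The data set names work.retail_shift and work.retail_dummies and the response name SALES are hypothetical.

```sas
/* Sketch only: data set names and SALES are hypothetical. */
data work.retail_dummies;
   set work.retail_shift;           /* data set that already contains X        */
   array D{11} D1-D11;              /* D1 = January, ..., D11 = November       */
   do m = 1 to 11;
      D{m} = (month(date) = m);     /* 1 in month m, 0 otherwise               */
   end;
   drop m;                          /* loop index is not needed afterwards     */
run;

proc reg data=work.retail_dummies;
   model sales = date X D1-D11;     /* trend, shift, and seasonal dummies      */
run;
quit;
```

With December as the baseline, each dummy coefficient is read as that month's average difference from December after accounting for the trend and the shift.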
(8) (optional) Using the discussion of intervention variables from our book, fit the model in part (5)
using the SAS Time Series Forecasting System
(hint: Go to the "develop models" window and select "fit a custom model")