* Data: O-ring erosion or blowby, USA space shuttle launches. * * Checks for leaks were done at different pressures (PSI) * * Space shuttle Challenger was launched at 31 degrees F, just * * below freezing, and exploded killing all on board shortly * * after liftoff. There were 23 shuttle missions prior to the * * Challenger disaster. These are the results of O-ring * * inspections for them. O-ring (seal) failure was the cause of * * the Challenger disaster. * *****************************************************************; **** DATAMINIG - first assignment. Due Feb. 13 ******* Space shuttle data: 1. Run the program below. We will discuss the method (logistic regression) soon. Look at the Chi-square statistics and p-values from PROC LOGISTIC. Which variables (launch temperature, launch sequence) matter in terms of the probability of O-ring failure? Explain briefly. 2. Modify the program so that it prints out the data for the last graph. What would have been the probability of 4 or more O-ring failures had the 23rd launch been done at 31 degrees F. (ambient temperatures were in the low 30's for Challenger) 3. Add the pre-launch pressure check to your model. Again check the Chi-square statistics from PROC LOGISTIC. Do you think there is any effect on the O-rings from this pre-launch pressure check (i.e. we might be worried that the check itself damages the rings). Use statistical arguments (of course!). 4. Again use PROC LOGISTIC. Omit everything but temperature from the model. Find and report the odds ratio. If an event has probability 2/3 of occuring, the odds are 2 to 1. If, when temperature increases by 1, we have probability 3/4 of occurence, then the odds are 3 to 1 and the odds ratio is 3/2 = 1.5, that is, an increase in temperature of 1 multiplies the odds by 1.5. What is the effect on the odds of failure if the temperature is increased 1 degree? What is better (safer) - cold or hot? 5. Import the data into Enterprise Miner. Fit the logistic regression on temperature there. Add a reporter node. What does the regression output say about the temperature effect? I have stored the O_ring dataset in crssamp. For this part, hand in the results of the reporter node. ; goptions reset=all; Data Shuttle; input Atrisk failed temp LCpress launch; * 6 O-rings per mission: Atrisk=6 everywhere. * * Failed = number out of 6 exhibiting erosion * * or blowby. *; label temp= "Temperature at Liftoff"; label launch = "Launch number"; label LCpress = "Leak Check pressure (PSI)"; cards; 6 0 66 50 1 6 1 70 50 2 6 0 69 50 3 6 0 68 50 4 6 0 67 50 5 6 0 72 50 6 6 0 73 100 7 6 0 70 100 8 6 1 57 200 9 6 1 63 200 10 6 1 70 200 11 6 0 78 200 12 6 0 67 200 13 6 2 53 200 14 6 0 67 200 15 6 0 75 200 16 6 0 70 200 17 6 0 81 200 18 6 0 76 200 19 6 0 79 200 20 6 2 75 200 21 6 0 76 200 22 6 1 58 200 23 ; data crssamp.O_ring; set shuttle; do ring=1 to 6; if ring>failed then fail=0; else fail=1; output ; keep temp launch LCpress fail; end; proc print data=shuttle; title "Shuttle data set"; proc print data=O_ring; title "O_ring data set"; run; ************* Add a grid for graphics ********; data shuttle; set shuttle end=EOF; id = "obsns"; output; if eof then do temp = 31 to 81 by 5; do launch = 1 to 23 by 2; failed=. ; id = "grid"; output; end; end; ************** Graph observed data *************; proc g3d data=shuttle; scatter launch*temp=failed / shape='prism'zmin=0; title "USA Space Shuttle Flights"; title2 "Failure = erosion or blowby"; ************** Estimate Logistic two ways ******; proc logistic data=shuttle; title3 "Logistic Regression"; model failed/atrisk = temp launch; output out=out1 predicted = p; run; proc genmod data=shuttle; title3 "Logistic Regression"; model failed/atrisk = temp launch/dist=binomial; run; ************** Plot prob(failure) vs. temp, launch # ****; proc g3d; plot launch*temp =p/ tilt = 82 rotate=25; where id = "grid"; label p = "Pr{O-ring failure}"; ************** Compute and plot Pr(4 or more failures) ****; data next; set out1; Prob_4=0; do i = 0 to 3; pi = gamma(7)/(gamma(i+1)*gamma(7-i)); if _n_=1 then put pi; ** check 6! /(i!)((6-i)!) ***; pi = pi*(p**i)*((1-p)**(6-i)); prob_4 = prob_4 + pi; end; prob_4=1-prob_4; title3 "Pr{4 or more failures} vs. temperature & launch no."; proc g3d; plot launch*temp =prob_4/ tilt = 82 rotate=25; where id = "grid"; label prob_4 = "Pr{4 or more}"; run;