
Lab 4 SAS Lab

By Jim Grant,2014-08-29 03:47

Lab 4: SAS Lab

     In today’s lab, we will use SAS to redo select homework problems. We will not focus on SAS code syntax, but rather on the output that SAS gives us and how it compares to what we have already calculated in R. If you would like to explore the code later, I have tried to provide as many comments as possible to indicate what SAS is doing. There are many ways to do an analysis in SAS; the code we will use today only requires that you run it. To keep the tutorial short, we will look only at residual plots and QQ plots as diagnostics.

The three problems we will revisit are:

    HW#2, Problem 1, GPA data

    HW#3, Problem 1, chickwts data

    HW#4, Problem 2, poison data

Getting started:

    First, please download the following data sets from Dr. Bailey’s website and save them to your computer. Remember where you saved them because we will need the path name of the file to bring the data into SAS.

    http://www-rohan.sdsu.edu/~babailey/stat700/gpa.dat

    http://www-rohan.sdsu.edu/~babailey/stat700/chickwts.dat

    http://www-rohan.sdsu.edu/~babailey/stat700/poison.dat

    Please locate SAS 9.1 on your computer. Unlike R, SAS outputs to multiple windows. The EDITOR window is where you enter your code. The OUTPUT window shows the results of the code you have compiled. The LOG window shows processing notes (such as CPU time) for each run, as well as any errors in your code. The GRAPH window shows high-resolution plots. For each homework problem that we go over, I highly suggest that you use a new EDITOR window. Let’s begin!

     HW#2, Problem 1, GPA data

    Purpose: Linear Models with AIC

    Copy the following code into your editor window and change your data file pathname in the ‘infile’ statement. To compile the code, either click the little running man button on the top toolbar or press F8 on your keyboard:

     /* STAT 700 HW2 P1 GPA data
        Linear Models with AIC */
     option formdlim='_' nodate pageno=1; *formats output window for nicer viewing;
     title 'STAT 700 HW2 P1 GPA data';

     data gpa;
       infile 'G:\STAT 700\hw2\gpa.dat' firstobs=2; *replace with the appropriate pathname, firstobs=2 tells SAS to skip the header row;
       input Student GPA SATmath SATverbal HSmath HSenglish; *list the header row variables here;
     *proc print data=gpa; run; *uncomment (remove the leading *) to print the data, it is a good idea to check that SAS imported it properly;

     proc reg data=gpa outest=est; *in order to see AIC, must ask SAS to output it to another data set;
       m1: model GPA=SATmath SATverbal HSmath HSenglish /aic; *full model, ask for AIC option;
       m2: model GPA=SATmath SATverbal HSmath /aic; *drop HSenglish;
       m3: model GPA=SATmath SATverbal HSenglish /aic; *drop HSmath;
       m4: model GPA=SATmath HSmath HSenglish /aic; *drop SATverbal;
       m5: model GPA=SATverbal HSmath HSenglish /aic; *drop SATmath;
       HSenglish: test HSenglish; *tests if HSenglish can be dropped from the model (TEST applies to the preceding MODEL statement, here m5);
       plot r.*p. /modellab='Residual Plot'; *creates residual plot;
       plot r.*nqq. /modellab='QQ Plot' noline; *creates qq plot without the horizontal reference line at zero;
     proc print data=est;
     run;
     quit;

     proc insight data=gpa; *creates multi-way scatter plot;
       scatter SATmath SATverbal HSmath HSenglish*SATmath SATverbal HSmath HSenglish;
     run; *proc insight opens another analysis feature in SAS, must close that analysis window to continue SAS;
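When comparing the five models, it can help to see the AIC values side by side. The sketch below sorts the `est` data set from the code above by AIC, smallest first. It assumes the AIC option has stored the criterion in the variable `_AIC_` and the model labels (m1 to m5) in `_MODEL_`; the name `est_sorted` is just a placeholder I chose. PROC REG computes AIC as n*ln(SSE/n) + 2p, where p counts the intercept. R's `AIC()` is based on the full log-likelihood, so expect its values to differ from these by a constant that is the same for every model fit to the same data; `extractAIC()` should be closer to what SAS reports.

```sas
/* Sketch: rank the candidate models by AIC, smallest first.          */
/* Assumes the OUTEST= data set est created by the PROC REG step.     */
proc sort data=est out=est_sorted;   /* est_sorted is a placeholder name */
   by _AIC_;
run;

proc print data=est_sorted;
   var _MODEL_ _AIC_;                /* model label and its AIC value */
run;
```

The model ranking should agree with what you found in R even if the AIC values themselves do not match.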

     HW#3, Problem 1, chickwts data

    Purpose: One-Way ANOVA, Pairwise Testing and Subsetting Data

    Open a new editor window and clear your LOG and OUTPUT window. Copy the following code into your editor window and change your data file pathname in the ‘infile’ statement. Compile the code:

     /* STAT 700 HW3 P1 chickwts data
        One-Way ANOVA, Pairwise Testing and Subsetting Data */
     option formdlim='_' nodate pageno=1;
     title 'STAT 700 HW3 P1 chickwts data';

     data chickwts;
       infile 'G:\STAT 700\hw3\chickwts.dat' firstobs=2;
       input chick weight feed $; *when giving SAS a character variable, must follow its name with $;
     *proc print data=chickwts; run;

     proc glm data=chickwts;
       class feed; *input factor here;
       model weight=feed;
       means feed /cldiff bon; *ask SAS for the confidence limits for the difference of means and the type of comparison;
       output out=resout p=preds rstudent=exstdres;
     run;
     quit;
     /* SAS options for multiple comparisons include: BON, DUNCAN, DUNNETT, DUNNETTL, DUNNETTU,
        GABRIEL, GT2, LSD, REGWQ, SCHEFFE, SIDAK, SMM, SNK, T, TUKEY, WALLER */

     title 'STAT 700 HW3 P1 chickwts data: Boxplots';
     proc boxplot data=chickwts; *creates boxplots;
       plot weight*feed;
     run;

     title 'STAT 700 HW3 P1 chickwts data: Residual Plot';
     proc gplot data=resout;
       plot exstdres*preds;
     run;
     quit;

     /* PROC GLM does not offer plot options, and the QQ plot from PROC UNIVARIATE is so low
        resolution that it is useless. So, here is the three step process for creating a
        QQ plot with SAS code */
     *Step 1: Sort the residuals;
     proc sort data=resout out=sortout;
       by exstdres;
     run;
     *Step 2: Calculate the quantile values for the residuals;
     data resnorm;
       set sortout;
       rankit = probit((_N_-3/8)/(71+1/4)); *in the quantile calculation, must input sample size, here n=71;
     run;
     *Step 3: Plot it!;
     title 'STAT 700 HW3 P1 chickwts data: QQ Plot';
     proc gplot data=resnorm;
       plot exstdres*rankit;
     run;
     quit;

     title 'STAT 700 HW3 P1 chickwts data: Data Subset';
     data sunsoy; *create new dataset;
       set chickwts; *tell SAS which data set to read from;
       if feed='soybean' or feed='sunflowe'; *note, the value sunflower is stored as 'sunflowe' because list input reads character values with a default length of 8;
     *proc print data=sunsoy; run;

     proc ttest data=sunsoy; *PROC TTEST automatically computes the test for equal variances when doing two-sample testing;
       class feed; *we are comparing two classes of feed: sunflower and soybean;
       var weight; *we are looking at difference in weight;
     run;
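Step 2 above uses what is usually called Blom's formula: the i-th smallest residual is paired with the normal quantile at probability (i - 3/8)/(n + 1/4), which is why the sample size is typed in by hand. As a sketch of two alternatives that avoid hard-coding n, the first block below asks the SET statement for the number of observations through its NOBS= option (the variable name `n` is my own choice), and the second lets PROC RANK compute the Blom scores directly with NORMAL=BLOM. Both assume the `resout` and `sortout` data sets from the code above and no missing residuals.

```sas
/* Alternative to Step 2: take n from the data set instead of typing 71. */
data resnorm;
   set sortout nobs=n;                        /* n = number of observations in sortout */
   rankit = probit((_N_ - 3/8) / (n + 1/4));  /* Blom's plotting position */
run;

/* Alternative to Steps 1 and 2 together: PROC RANK computes Blom scores. */
proc rank data=resout normal=blom out=resnorm;
   var exstdres;      /* residuals to be ranked */
   ranks rankit;      /* new variable holding the normal scores */
run;
```

Either version leaves a `rankit` variable in `resnorm`, so the plot in Step 3 can be used unchanged. PROC RANK averages tied ranks by default, so its scores can differ slightly from the DATA step version when residuals are tied.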

     HW#4, Problem 2, poison data

    Purpose: Two-Way ANOVA, Profile/Interaction Plots, Data Transformation

    Open a new editor window and clear your LOG and OUTPUT window. Copy the following code into your editor window and change your data file pathname in the ‘infile’ statement. Compile the code:

     /* STAT 700 HW4 P2 poison data
        Two-Way ANOVA, Profile/Interaction Plots, Data Transformation */
     option formdlim='_' nodate pageno=1; *options to format the output window;
     title 'STAT 700 HW4 P2 poison data: Original Data';

     data poison;
       infile 'G:\STAT 700\hw4\poison.dat' firstobs=2;
       input poison treatment survival;
       recipsurv=1/survival; *create new variable, the reciprocal of the survival time;
     *proc print data=poison; run;

     proc glm data=poison;
       class poison treatment; *alpha, beta: our main effects;
       model survival=poison treatment poison*treatment; *interaction term is alpha*beta;
       lsmeans poison treatment poison*treatment /out=outmns; *gives least square means and outputs them into another data set called 'outmns';
       means poison treatment /cldiff bon; *ask SAS for the confidence limits for the difference of means and the type of comparison;
       output out=resout p=preds rstudent=exstdres; *outputs the residuals and predicted value to a data set called 'resout';
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: Strip Plots';
     proc gplot data=poison;
       plot survival*poison;
     run;
     quit;
     proc gplot data=poison;
       plot survival*treatment;
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: Profile/Interaction Plots';
     symbol i=j; *tells SAS to draw lines between joint means;
     proc gplot data=outmns;
       where poison ne . and treatment ne .; *remove the marginal means from the data set since we only wish to plot joint means;
       plot lsmean*poison=treatment;
       plot lsmean*treatment=poison;
     run;
     quit;
     goptions reset=all; *resets PROC GPLOT options;

     title 'STAT 700 HW4 P2 poison data: Residual Plot';
     proc gplot data=resout;
       plot exstdres*preds;
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: QQ Plot';
     *Step 1: Sort the residuals;
     proc sort data=resout out=sortout;
       by exstdres;
     run;
     *Step 2: Calculate the quantile values for the residuals;
     data resnorm;
       set sortout;
       rankit = probit((_N_-3/8)/(48+1/4)); *in the quantile calculation, must input sample size, here n=48;
     run;
     *Step 3: Plot it!;
     proc gplot data=resnorm;
       plot exstdres*rankit;
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: Transformed Data';
     proc glm data=poison;
       class poison treatment; *alpha, beta: our main effects;
       model recipsurv=poison treatment poison*treatment; *interaction term is alpha*beta;
       lsmeans poison treatment poison*treatment /out=outmns; *gives least square means and outputs them into another data set called 'outmns';
       means poison treatment /cldiff bon; *ask SAS for the confidence limits for the difference of means and the type of comparison;
       output out=resout p=preds rstudent=exstdres; *outputs the residuals and predicted value to a data set called 'resout';
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: Strip Plots (Transformed Data)';
     proc gplot data=poison;
       plot recipsurv*poison; *strip plot of the transformed response;
     run;
     quit;
     proc gplot data=poison;
       plot recipsurv*treatment;
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: Profile/Interaction Plots (Transformed Data)';
     symbol i=j; *tells SAS to draw lines between joint means;
     proc gplot data=outmns;
       where poison ne . and treatment ne .; *remove the marginal means from the data set since we only wish to plot joint means;
       plot lsmean*poison=treatment;
       plot lsmean*treatment=poison;
     run;
     quit;
     goptions reset=all; *resets PROC GPLOT options;

     title 'STAT 700 HW4 P2 poison data: Residual Plot (Transformed Data)';
     proc gplot data=resout;
       plot exstdres*preds;
     run;
     quit;

     title 'STAT 700 HW4 P2 poison data: QQ Plot (Transformed Data)';
     *Step 1: Sort the residuals;
     proc sort data=resout out=sortout;
       by exstdres;
     run;
     *Step 2: Calculate the quantile values for the residuals;
     data resnorm;
       set sortout;
       rankit = probit((_N_-3/8)/(48+1/4)); *in the quantile calculation, must input sample size, here n=48;
     run;
     *Step 3: Plot it!;
     proc gplot data=resnorm;
       plot exstdres*rankit;
     run;
     quit;
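The profile plots above are drawn from the joint least square means in `outmns`. If you would like a quick check on those numbers, the sketch below computes the raw cell means for each poison-by-treatment combination with PROC MEANS. It assumes the `poison` data set created above; the name `cellmeans` is a placeholder of my own. When every cell has the same number of observations (I am assuming that is the case for this file, with n=48), the joint least square means should agree with these raw cell means.

```sas
/* Sketch: raw cell means for each poison-by-treatment combination.   */
/* Assumes the data set poison (with survival and recipsurv) exists.  */
proc means data=poison nway noprint;
   class poison treatment;
   var survival recipsurv;
   output out=cellmeans mean=;   /* output variables keep the names survival and recipsurv */
run;

proc print data=cellmeans;
run;
```

The NWAY option keeps only the rows for the full poison-by-treatment crossing, which plays the same role as the `where` statement that drops the marginal means from `outmns`. The `_FREQ_` column in the printout shows how many observations fall in each cell, so you can check the balance yourself.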
