Kaz's SAS, HLM, and Rasch Model
Welcome to my SAS juku
I have collected many examples for people who are learning SAS.


I learned SAS programming by looking at examples.  The programs below are for people who share my learning style.  All programs use SAS's default data sets in the SASHELP library.

/*count the distinct values of ID, skipping blank IDs*/
proc freq data=R2 nlevels;
where ID ne "" and ID ne ".";
tables ID / noprint;
ods output nlevels=new;
run;

  • PROC MEANS for getting basic descriptive statistics of interval-scale variables.  But don't use it to create mean variables; use PROC SQL instead. (Also, PROC CORR's descriptive statistics look better than PROC MEANS'.  PROC MEANS output tends to spread across pages and looks ugly.)
  • PROC SQL to create mean variables.  Most programs I have seen create mean-value variables through a tedious PROC MEANS process, but PROC SQL is a lot better.
  • PROC FACTOR to do factor analysis.  I show a simple example, as well as a macro program that gets you a result like this.  Result in a text format.  Result document in rich text format.
  • PROC FACTOR for factor analysis, a macro to deal with more than one set of variables.  Just a modified version of the above program.  I like this version better, especially for its use of Notepad to view the result.
  • DATA steps to manipulate data (sort data sets, keep/drop variables, merge data sets or pile one on top of another, create new variables, etc.)
  • PROC STANDARD does the following.  They are all the same thing, actually.
    • create Z-scores
    • grand-mean or group-mean centering when doing PROC MIXED/HLM
    • imputation of missing values based on either the grand mean or the group mean.
  • Descriptive, exploratory analysis: get basic statistical properties of variables.
  • REPORTING of descriptive statistics results: If you are a researcher or research assistant working in a group, you need to report results in a meaningful manner.  It is also useful for making sense of your own results.  Here I show how ODS (Output Delivery System) can be used to create a report table that is easy to understand.  The results can be saved in an Excel sheet or as an RTF document.
  • REPORTING of regression results: I have a macro for you to try on another page of my site, but it is also useful to try to figure out this code, which is a simplified version.  If you use the full macro, the results will look like this.
  • Creating a report that is a text file.  I hate ODS output files because they look ugly and I am not patient enough to learn how to manipulate the styles.
  • PROC MIXED to do multilevel model/HLM. 
  • Comparison of PROC MIXED and PROC REG.  If I had seen this syntax before I took my HLM class in 1994, I would have understood the class a lot better.  The message here is that PROC MIXED gives the same results as OLS if you simply don't specify a RANDOM statement.  That tells us something.  By the way, I think PROC MIXED is in fact better than PROC REG for a simple linear model like OLS, because MIXED takes a CLASS statement, so we don't have to recode character variables into a series of dummy variables.  I think maximum likelihood estimation returns the same coefficient estimates as OLS for a simple linear model.
  • PROC NLMIXED to do multilevel logistic regression.  Also see my attempt to replicate the Rasch model with NLMIXED.  The problem with NLMIXED is that it has trouble converging when the model is poorly specified, which happens a lot in social science.  GLIMMIX, below, is easier, though its estimation method seems less rigorous.
  • PROC NLMIXED to do lots of modeling for the sake of learning.  Min-Ah says, "The default DF in PROC NLMIXED is the number of level-2 units minus 1.  So, if one wants to analyze the effects of level-1 and level-2 variables together (multilevel), he/she has to add an option to fix the DF for the level-1 variables in the syntax.  So, I used the ESTIMATE statement to fix the DF for the level-1 variables."
  • PROC GLIMMIX to do multilevel logistic regression. 
  • A GLIMMIX macro glmm800.sas (This used to be used before PROC GLIMMIX) 
  • Comparison program: Analytical sample versus a full sample.  Have you ever been in a situation where, because of the pattern of missing cases in your data, your analytical sample gets very small?  You will have to explain how the reduced data set differs from the full data set.  This program does that check.  I had to do this for every paper, so I finally wrote a macro-type program.  The documentation is poor, so please ask me. kuekawa @ alumni.uchicago.edu
  • AddSuffix program: Add suffix to every variable in the data.
  • Read a file off the internet using filename http URL
  • PROC COMPARE: compare two data sets and see if they are different.
  • X statement lets you leave the SAS environment and execute commands in the MS-DOS environment.
  • SAS DDE (Dynamic Data Exchange)
  • Be careful with the RETAIN statement.  It can mess up the values of variables.
  • PROC MI
  • PROC SORT to deal with duplicate IDs http://analytics.ncsu.edu/sesug/2006/CC14_06.PDF
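
The three PROC STANDARD uses in the list above can be sketched in a few steps. This is only a minimal illustration on SASHELP.CLASS; the output data set names (class_z, class_centered, class_missing, class_imputed) are made up for the example, and one HEIGHT value is set to missing on purpose so that the imputation step has something to fill in.

```sas
/* Z-scores: rescale each variable to mean 0, standard deviation 1. */
proc standard data=sashelp.class mean=0 std=1 out=class_z;
  var height weight;
run;

/* Group-mean centering: BY-group processing needs sorted data. */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

proc standard data=class_sorted mean=0 out=class_centered;
  by sex;
  var height weight;
run;

/* Mean imputation: blank out one value (for illustration only),
   then let REPLACE fill missing values with the group mean. */
data class_missing;
  set class_sorted;
  if name = 'Alfred' then height = .;
run;

proc standard data=class_missing replace out=class_imputed;
  by sex;
  var height;
run;
```

Dropping the BY statement turns the group-mean versions into grand-mean versions, which is why the three tasks are really the same operation.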

proc means data=budget;
class ID;
/*the second variable name is assumed; the original listing is cut off here*/
var Funding_Budget1 Funding_Budget2;
output out=budget_info(drop=_type_ _freq_)  mean=budget1_average budget2_average; run;
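
For comparison with the PROC MEANS step above, here is a rough PROC SQL sketch of the mean-variable approach recommended in the list. It assumes SASHELP.CLASS stands in for the budget data and SEX for the ID variable; class_means, height_group_mean, and height_centered are names invented for the example.

```sas
/* Attach the group mean to every row in one step.
   SAS remerges the summary statistic with the detail rows
   and writes a NOTE about the remerge to the log. */
proc sql;
  create table class_means as
  select a.*,
         mean(height)          as height_group_mean,
         height - mean(height) as height_centered
  from sashelp.class as a
  group by sex;
quit;
```

The appeal is that the mean variable lands next to the original rows directly, with no separate output data set to merge back in.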


PROC DATASETS to delete all temporary data sets in the WORK library:
proc datasets lib=work nolist kill;
quit;
To delete specific data sets in the WORK library:
proc datasets library=work nolist;
    delete name_of_data_here;
quit;
How to send the log and the output to external files (so the SAS windows do not fill up and stop a long program):

filename printout 'C:\xxxxx\output.txt';
filename logout 'C:\xxxxx\log.txt';
proc printto print=printout log=logout new;
run;




/*this makes sure there is no previous data set named "alldata"*/
proc datasets library = work nolist;
    delete alldata;
quit;

%macro kaz(var1=);
data new&var1;set maindata;
run;
/*pile each piece onto alldata; PROC APPEND takes BASE= and DATA=*/
proc append base=alldata data=new&var1 force;
run;
%mend kaz;
%include "c:\temp\example.txt";

Which output tables does each PROC make available as SAS data sets?  This is important to know when you want to use ODS.
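
One way to find the table names for a given PROC is ODS TRACE, which writes the name of every output table to the log. The sketch below uses PROC FREQ on SASHELP.CLASS; nlevels_table and freq_table are names invented for the example.

```sas
/* Step 1: list the output tables this PROC produces (see the log). */
ods trace on;
proc freq data=sashelp.class nlevels;
  tables sex;
run;
ods trace off;

/* Step 2: save the tables named in the log as SAS data sets. */
ods output NLevels=nlevels_table OneWayFreqs=freq_table;
proc freq data=sashelp.class nlevels;
  tables sex;
run;
```

The same two steps should work for other procedures; only the table names reported in the log change.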


Routine work tips

Solving problems in real situations
Case: Imagine you have fifty organizations to report to.  You prepared the reports, and before sending them you spot an error: instead of "2003" you needed to say "2005."  You don't want to open fifty documents to make this change.
Solution: Activate an MS Word macro from within SAS.
Case: You need to create a table that has descriptions of variables.  If your labels are nicely prepared, you can use them to create this table.
Case: You have a gigantic SAS data set and you need to create a text-file version of it.  You need to create position statements that tell SAS where the variables have to be placed in the text file.
Case: The log and output get too long, and SAS has to stop in the middle.  To prevent this, save them to text files.
filename printout 'C:\Documents and Settings\output.txt';
filename logout 'C:\Documents and Settings\log.txt';
proc printto print=printout log=logout new;
run;
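
A fuller sketch of the redirect-and-restore cycle follows. The c:\temp paths are placeholders, not the author's. Note that PRINT= captures listing output only, so the LISTING destination may need to be opened first in SAS versions where it is not the default.

```sas
/* Placeholder paths; change them to a folder you can write to. */
filename logout   'c:\temp\log.txt';
filename printout 'c:\temp\output.txt';

/* PRINT= captures listing output, so make sure LISTING is open. */
ods listing;

proc printto log=logout print=printout new;
run;

/* The long program goes here; one step as a stand-in. */
proc print data=sashelp.class;
run;

/* PROC PRINTTO with no options restores the default destinations. */
proc printto;
run;
```

Without the final PROC PRINTTO step, everything run later in the session keeps going to the two files.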

Let SAS write macro calls based on variable values in a data set:
%let cate=sex;
/*using a data set and syntax, create a variable
that has a SAS statement.  See variable CODE*/
data syntax; set sashelp.class;
code='%kaz (var1='||&cate||');';
run;
/*keep one row per value of &cate*/
proc sort data=syntax nodupkey;by &cate;run;

/*Now write out an external text file
that has the macro calls*/
data _null_;set syntax;
file "c:\temp\example.txt"; /*you can change this*/
put code;
run;
/*Macro definition*/
/*Exact step is up to what you want to do*/
/*This is an example of breaking the data into
pieces based on the value of &cate (SEX here)*/
%macro kaz (var1=);
data &var1;
set sashelp.class;
if &cate="&var1";
run;
%mend kaz;
/*And read the text file you created earlier*/
%include "c:\temp\example.txt";
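
A different technique that skips the intermediate text file is CALL EXECUTE, which queues each macro call straight from the data step. This is an alternative sketch, not the method used above; the data set names levels and sex_&var1 are invented for the example.

```sas
%macro kaz(var1=);
  /* One output data set per value, e.g. sex_F and sex_M. */
  data sex_&var1;
    set sashelp.class;
    if sex = "&var1";
  run;
%mend kaz;

/* One row per distinct value of SEX. */
proc sort data=sashelp.class(keep=sex) out=levels nodupkey;
  by sex;
run;

/* Queue one macro call per row. %NRSTR delays macro execution
   until the calls run, after this data step has finished. */
data _null_;
  set levels;
  call execute(cats('%nrstr(%kaz)(var1=', sex, ')'));
run;
```

The text-file version has the advantage that the generated calls can be inspected in an editor before they run.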

Silly but perhaps convenient, at least for me


Applying a t-test in a data step:
data &schoollevel.&var1;
retain group;
merge ueka1b ueka2b;
by group;
/*two-tailed, 5% level*/
if P < 0.05 then SIG="*";
drop A B P1 P2 N1 N2;
run;
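
The fragment above does not show how P is computed. One possible way, sketched here under stated assumptions, is a pooled-variance two-sample t-test built from group means, standard deviations, and sample sizes with the PROBT function. The input numbers and all variable names (m1, s1, n1, and so on) are hypothetical and are not taken from the original program.

```sas
/* Hypothetical summary statistics: mean, SD, and N for two groups. */
data summary;
  input m1 s1 n1 m2 s2 n2;
  datalines;
62.5 5.0 9 63.9 4.9 10
;
run;

data ttest;
  set summary;
  length sig $1;
  df  = n1 + n2 - 2;                                  /* degrees of freedom */
  sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df;   /* pooled variance */
  t   = (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2));    /* t statistic */
  p   = 2 * (1 - probt(abs(t), df));                  /* two-tailed p-value */
  if p < 0.05 then sig = "*";
run;
```

PROC TTEST remains the more direct route when the raw data are available; the data-step version is mainly handy when only summary numbers are in hand.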

Copyright 2005 KU
For information inquiry (AT) estat.us