8/13/2019 SAS Handbook
1/20
Math 338 - Introduction to SAS
Fall 2013
SAS HANDBOOK
By: Luis Montes
Table of Contents
1. DATA MANAGEMENT
1.1 Data Step
    A. DATA Statement Options
    B. Defining Variables
    C. Input Statement
    D. Datalines Statement
    E. Set Statement
    F. Merge Statement
    G. Length Statement
    H. Label Statement
    I. If-Else Statement
    J. Infile Statement
    K. Do Statement
    L. Keep-Drop Statements
    M. Output Statement
    N. Generating Random Numbers
    O. Internal Values
    P. Format and Informat Statements
    Q. File Statement
    R. Put Statement
    S. Array Statement
1.2 Proc Import Step
    A. Proc Import Statement Options
    B. Getnames Statement
1.3 Statements Outside of Data and Procedure Steps
    A. Libname Statement
    B. Quit Statement
2. SORTING, PRINTING, AND SUMMARIZING DATA
2.1 Proc Print Step
    A. Proc Print Statement Options
    B. ID Statement
    C. By Statement
    D. Sum Statement
    E. Title & Footnote Statements
    F. Var Statement
    G. Sumby Statement
2.2 Proc Frequency Step
    A. Proc Frequency Statement Options
    B. Weight Statement
    C. Tables Statement
    D. Where Statement
2.3 Proc Contents Step
    A. Proc Contents Statement Options
2.4 Proc Tabulate Step
    A. Proc Tabulate Statement Options
    B. Class Statement
    C. Var Statement
    D. Table Statement
2.5 Proc Sort Step
    A. Proc Sort Statement Options
2.6 Proc GChart Step
    A. Proc GChart Statement Options
    B. HBar, VBar, and VBar3D
    C. Block Statement
2.7 Proc GPlot Step
    A. Proc GPlot Statement Options
    B. Plot Statement
    C. Symbol Statement
2.8 Proc Format Step
    A. Proc Format Statement Options
    B. Value Statement
    C. Picture Statement
3. STATISTICAL ANALYSIS IN SAS
3.1 Proc Univariate Step
    A. Proc Univariate Statement Options
    B. Var Statement
    C. Histogram Statement
3.2 Proc Means Step
    A. Proc Means Statement Options
    B. Var Statement
3.3 Proc ttest Step
    A. Proc ttest Statement Options
    B. Class Statement
    C. Var Statement
    D. Paired Statement
3.4 Proc Corr Step
    A. Proc Corr Statement Options
    B. Var Statement
3.5 Proc Reg Step
    A. Proc Reg Statement Options
    B. Model Statement
    C. Plot Statement
3.6 Proc GLM Step
    A. Proc GLM Statement Options
    B. LSMeans Statement
3.7 Proc Logistic Step
    A. Proc Logistic Statement
    B. Class Statement
    C. Model Statement
1 DATA MANAGEMENT
1.1 DATA STEP
A. Data Statement Options
DATA DATA-SET-NAME-1
This is a trailing @. It must be the last item in the input statement or else it becomes a pointer control. It holds the input reader at the final location, and the next input statement continues at this spot.
-#n
This is a line pointer. It moves the input reader to row n.
-/
Advances the input reader to the first column of the next line.
-@n
This is a column pointer. It moves the input reader to column n. n must be an integer.
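The pointer controls above can be sketched as follows. This is only an illustration: the data set names, variable names, and raw data values are made up for the example.

```sas
/* Column pointers: ID starts in column 1, SCORE in column 10. */
data column_demo;
    input @1 id 3. @10 score 3.;
    datalines;
101      87
102      93
;
run;

/* Trailing @: hold the line, inspect TYPE, then finish reading it. */
data held_demo;
    input type $ @;
    if type = 'A' then input x y;
    else input x;
    datalines;
A 1 2
B 3
;
run;

/* Line pointer /: each observation spans two lines of raw data. */
data two_line_demo;
    input name $ / age;
    datalines;
Ann
34
Bob
29
;
run;
```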
D. Datalines Statement
SYNTAX: DATALINES;
-With no options, the datalines statement is followed by raw data entered by the user. The SAS editor displays this by highlighting the raw data in yellow.
-Delimiter= option
Specifies what is delimiting the raw data. It is written on an INFILE DATALINES statement placed before the input statement. By default SAS uses a blank as the delimiter, but it can also use commas (dlm=',') or tabs (dlm='09'x), among many others.
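A minimal sketch of comma-delimited raw data read with datalines. The data set name, variables, and values are illustrative assumptions.

```sas
data scores;
    /* DLM= is given on an INFILE statement that points at the datalines */
    infile datalines dlm=',';
    input name $ score;
    datalines;
Ann,90
Bob,85
;
run;
```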
E. Set Statement
SYNTAX: SET DATA-SET(S);
-Recall that the DATA step is itself a loop being applied to a data set. Each time the Set statement executes, it reads one observation (including all variables) into the program data vector, where it can be manipulated in the data step and even output if desired.
-IN= option
This option generates a new temporary variable (which we name), which takes a value of 1 if the data set contributes to the current observation and a value of 0 otherwise. The variable is not written to the output data set.
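As a sketch of stacking two data sets with Set and tracking their origin with IN=, assuming two small made-up data sets named east and west:

```sas
data east;
    input id sales;
    datalines;
1 100
2 150
;
run;

data west;
    input id sales;
    datalines;
3 200
;
run;

data all_regions;
    /* from_east is 1 for observations read from EAST, 0 otherwise */
    set east(in=from_east) west;
    length region $ 4;
    if from_east then region = 'East';
    else region = 'West';
run;
```

Because from_east is temporary, its value is copied into the ordinary variable region so that it survives in the output data set.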
F. Merge Statement
SYNTAX: MERGE DATA-SET(S);
-The Merge statement differs from the Set statement in that, instead of combining data sets by stacking observations vertically, it combines observations of data sets horizontally, adding variables. A BY statement following the Merge statement is very helpful: it matches observations on the BY variable(s), and the data sets must already be sorted by those variables.
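A short sketch of a match-merge, assuming two hypothetical data sets that share an id variable:

```sas
data names;
    input id name $;
    datalines;
1 Ann
2 Bob
;
run;

data grades;
    input id grade;
    datalines;
1 90
2 85
;
run;

/* Both inputs must be sorted by the BY variable before merging. */
proc sort data=names;
    by id;
run;
proc sort data=grades;
    by id;
run;

data roster;
    merge names grades;
    by id;
run;
```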
G. Length Statement
SYNTAX: LENGTH VARIABLE-1 VARIABLE-1-LENGTH;
-The length statement sets the number of bytes used to store a variable: 2-8 or 3-8 for numeric variables (depending on operating environment) and 1-32767 for alphanumeric variables. Variables can also be defined in the length statement; placing a $ after a variable name specifies it as an alphanumeric variable.
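For instance, a length statement placed before the input statement lets a character variable hold more than the default 8 characters. The names and values below are illustrative only.

```sas
data people;
    /* NAME is character with room for 20 characters; AGE is numeric, 4 bytes */
    length name $ 20 age 4;
    input name $ age;
    datalines;
Alexandria 34
Bo 7
;
run;
```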
H. Label Statement
SYNTAX: LABEL VARIABLE='LABEL';
-The label statement assigns a descriptive display name to the variable it is applied to. If it is applied in a data step, the label is permanently associated with the variable. It can also be applied in a procedure step, but then the label is used only within that procedure step.
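A minimal sketch of a permanent label assigned in a data step and displayed by proc print (the data set and label text are made up):

```sas
data heights;
    input ht;
    label ht = 'Height (inches)';
    datalines;
62
70
;
run;

/* The LABEL option asks PROC PRINT to show labels as column headings. */
proc print data=heights label;
run;
```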
I. If-Else Statement
SYNTAX: IF (LOGICAL EXPRESSION) THEN (STATEMENT); ELSE (STATEMENT);
-SAS evaluates the logical expression after IF and, if it returns a TRUE value, executes the statement after THEN. An ELSE statement is optional, but when present it must immediately follow the IF statement; its statement is executed if the logical expression after IF returns a FALSE value.
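A small sketch of If-Else in a data step, with an assumed cutoff of 60 chosen purely for illustration:

```sas
data graded;
    input score;
    length result $ 4;
    /* GE is the SAS mnemonic for "greater than or equal to" */
    if score ge 60 then result = 'Pass';
    else result = 'Fail';
    datalines;
75
42
;
run;
```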
J. Infile Statement
SYNTAX: INFILE FILE-PATH;
-The file-path is a pathway to an external file we want to pull into SAS, such as a .txt file. Just as with the datalines statement, DLM= can be used as an option here.
-FLOWOVER option
The default method of reading for infile. When an input line runs out of values before all variables in the input statement have been read, the input reader moves on to the next line and takes the remaining values from there.
-MISSOVER option
When the input reader reaches the end of an input line before all variables have been read, it does not move to the next line; the remaining variables are set to missing values.
-STOPOVER option
When the input reader reaches the end of an input line before all variables have been read, SAS treats this as an error and stops the data step.
The figure to the right is a screenshot of examples for MISSOVER, FLOWOVER, and STOPOVER options for the infile statement. They are applied to the data set:
1, 2, 3
1, , 3
, 2, 3
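The difference between FLOWOVER and MISSOVER can be sketched on a short input line. The data below are made up for this illustration (they are not the data shown in the figure), and datalines stands in for an external file so the steps are self-contained.

```sas
/* FLOWOVER (default): the short second line takes C from the third line,
   so only two observations are created. */
data flow_demo;
    infile datalines dlm=',';
    input a b c;
    datalines;
1,2,3
4,5
6,7,8
;
run;

/* MISSOVER: the short second line leaves C missing,
   so three observations are created. */
data miss_demo;
    infile datalines dlm=',' missover;
    input a b c;
    datalines;
1,2,3
4,5
6,7,8
;
run;
```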
K. Do Statement
SYNTAX: DO INDEX-VAR=SPECIFICATION; SAS STATEMENT(S); END;
-Conditional Do Loops (While)
We have the option to have SAS execute statements while a logical expression is true. The logical expression's value is checked before the statements are executed, so the statements may never run.
-Conditional Do Loops (Until)
We have the option to have SAS execute statements until a logical expression becomes true. The logical expression's value is checked after the statements are executed, so the statements always run at least once.
-Iterative Do Loops (Ex. i=1 to 100 by 5)
We can also have SAS execute a statement a finite number of times, while also creating an iterative variable. The by option designates the increment.
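The three kinds of Do loop can be sketched as below. Variable names and starting values are arbitrary choices for the example.

```sas
/* Iterative loop: i takes the values 1, 6, 11, ..., 96. */
data squares;
    do i = 1 to 100 by 5;
        square = i * i;
        output;
    end;
run;

data while_until;
    n = 10;
    /* DO WHILE tests first: n lt 5 is already false, so the body never runs. */
    do while (n lt 5);
        n = n + 1;
    end;
    /* DO UNTIL tests last: the body runs once even though n ge 5 already holds. */
    do until (n ge 5);
        n = n + 1;
    end;
    put n=;   /* writes n=11 to the log */
run;
```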
L. Keep-Drop Statement
SYNTAX: DROP VARIABLE-1 ... VARIABLE-N;
KEEP VARIABLE-1 ... VARIABLE-N;
-The Drop statement drops all listed variables from the data set. Variables not listed remain. Variable names are separated by spaces.
-The Keep statement keeps all listed variables in the data set. Variables not listed are dropped.
-Keep and Drop can also be used as options in a set statement, in the form: SET DATA-SET (KEEP=VARIABLE);
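A brief sketch of both forms, using a hypothetical data set named full:

```sas
data full;
    input id height weight;
    datalines;
1 60 120
2 70 180
;
run;

/* DROP statement: WEIGHT is left out of the new data set. */
data no_weight;
    set full;
    drop weight;
run;

/* KEEP= data set option: only ID is read from FULL. */
data ids_only;
    set full(keep=id);
run;
```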
M. Output Statement
SYNTAX: OUTPUT;
-Without listing data sets after OUTPUT, the OUTPUT statement writes the current
observation to all data sets in the data statement. Otherwise, only the data sets listed
take the current observation.
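As a sketch, one data step can route observations to different data sets by naming them after OUTPUT. The data set names and the cutoff of 50 are assumptions for illustration.

```sas
data low high;
    input score;
    /* LT is the SAS mnemonic for "less than" */
    if score lt 50 then output low;
    else output high;
    datalines;
30
80
;
run;
```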
N. Generating Random Numbers
SYNTAX: VARIABLE=RAND('DISTRIBUTION', PARAMETER(S));
-The random function generates a random number with a given distribution. The distribution name is given as a quoted string.
RAND('BINOMIAL',p,n) ~ Bin(p,n)
RAND('GEOMETRIC',p) ~ Geom(p)
RAND('POISSON',m) ~ Pois(m)
RAND('UNIFORM') ~ U(0,1)
RAND('BERNOULLI',p) ~ Bern(p)
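A sketch that draws five values from each distribution listed above. The seed and parameter values are arbitrary; CALL STREAMINIT is used here only to make the draws reproducible.

```sas
data random_draws;
    call streaminit(338);              /* arbitrary seed */
    do i = 1 to 5;
        u = rand('UNIFORM');           /* U(0,1) */
        b = rand('BERNOULLI', 0.3);    /* 0 or 1 */
        k = rand('BINOMIAL', 0.3, 10); /* p first, then n */
        g = rand('GEOMETRIC', 0.3);
        p = rand('POISSON', 4);
        output;
    end;
run;
```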
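O. Internal Values
_N_ : The number of times the DATA step has iterated so far. When each iteration reads one observation, this equals the number of the current observation.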
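Since _N_ itself is not saved in the output data set, a common pattern is to copy it into an ordinary variable. A minimal sketch with made-up data:

```sas
data numbered;
    input x;
    row = _n_;   /* 1 for the first observation, 2 for the second */
    datalines;
10
20
;
run;
```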
P. Format and Informat Statements
SYNTAX: FORMAT VARIABLE-1 FORMAT-1;
INFORMAT VARIABLE-1 INFORMAT-1;
-The format statement changes the appearance of a variable without changing the original variable. A list of formats can be found at:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a001263753.htm.
-The informat statement tells SAS how to read the raw data form of a variable into its stored value. Informats can also be applied in the input statement. Informats for SAS 9.2 can be found at:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a001239776.htm.
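A sketch pairing an informat (how a raw date is read) with formats (how stored values are displayed). The data set, variables, and values are assumptions for the example.

```sas
data sales;
    informat sale_date mmddyy10.;               /* read 08/13/2019 as a SAS date */
    format sale_date date9. amount dollar10.2;  /* display as 13AUG2019 and $1,234.50 */
    input sale_date amount;
    datalines;
08/13/2019 1234.5
;
run;

proc print data=sales;
run;
```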
Q. File Statement
SYNTAX: FILE FILE-PATH;
-The file statement creates an external file that will be written by the put statements in the data step. We can also use the print device (FILE PRINT;) so that the written lines are displayed in the output window.
R. Put Statement
SYNTAX: PUT VARIABLE;
-The put statement works similarly to the input statement, only it writes lines to the external file given by the file statement.
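File and Put usually appear together. The sketch below writes to the print device so that it runs without any external file; replacing PRINT with a quoted path would write to that file instead.

```sas
data _null_;
    file print;      /* or a quoted file path of our choosing */
    do i = 1 to 3;
        sq = i * i;
        put i= sq=;  /* writes lines such as i=2 sq=4 */
    end;
run;
```

DATA _NULL_ is used because the step only writes lines and does not need to create a data set.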
S. Array Statement
SYNTAX: ARRAY ARRAY-NAME{SUBSCRIPT};
-SAS generates an array with the ARRAY statement. The name, subscript, whether or not it is alphanumeric (by placing the $ symbol), length, and elements are specified by the user.
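A sketch of an array used to apply the same calculation to several variables. The variables t1-t3 and the Fahrenheit-to-Celsius conversion are illustrative choices.

```sas
data converted;
    input t1 t2 t3;
    array temps{3} t1-t3;   /* temps{1} refers to t1, and so on */
    do j = 1 to 3;
        temps{j} = (temps{j} - 32) * 5 / 9;
    end;
    drop j;
    datalines;
32 212 98.6
;
run;
```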
1.2 PROC IMPORT STEP
A. Proc Import Statement Options
SYNTAX: PROC IMPORT DATAFILE=FILE-PATH OUT=DATA-SET;
-The proc import step is helpful for importing large files (given by the file-path) into SAS
such as excel (.xls) files and export (.xpt) files. The proc import statement includes an out
argument, producing a data set. The replace option will overwrite any existing data set
with the same name.
B. GetNames Statement
SYNTAX: GETNAMES=(YES-OR-NO);
-This statement specifies whether or not proc import should take the first row of the
input data file as the list of variable names.
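A sketch of a proc import step for a comma-separated file. The file path and output data set name are hypothetical, and the DBMS=CSV option is an addition not discussed above; the file must exist for the step to run.

```sas
proc import datafile='C:\data\grades.csv'   /* hypothetical path */
            out=work.grades
            dbms=csv
            replace;
    getnames=yes;   /* first row of the file holds the variable names */
run;
```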
1.3 STATEMENTS OUTSIDE OF DATA AND PROCEDURE STEPS
A. Libname Statement
SYNTAX: LIBNAME NAME FOLDER-PATH;
-The libname statement assigns a library name to a folder in which permanent SAS data sets can be stored. A permanent SAS data set is created in a data step if it is named NAME.dataset, where NAME is the name of the library.
B. Options Statement
SYNTAX: OPTIONS OPTION(S);
-The options statement changes system settings such as the line size and page orientation.
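A sketch of both statements. The library name mylib and the folder path are hypothetical (the folder must already exist), and the option values are arbitrary.

```sas
libname mylib 'C:\sasdata';    /* hypothetical folder */
options linesize=80 nodate;    /* 80-character lines, no date in page headers */

/* Stored permanently as SQUARES in the MYLIB library. */
data mylib.squares;
    do n = 1 to 5;
        sq = n * n;
        output;
    end;
run;
```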
2 SORTING, PRINTING, AND SUMMARIZING DATA
2.1 PROC PRINT STEP
A. Proc Print Statement Options
SYNTAX: PROC PRINT OPTION(S);
-The proc print step is usually used to show the observations of a data set in a list, while giving the user several options. The proc print statement itself has a few options:
Data=data-set
Specifies which data set to print.
Label
Prompts SAS to use user-generated labels, whether they were created in the data set's data step or in this proc print step.
noobs
Removes the observation numbers in the print output.
B. ID Statement
SYNTAX: ID VARIABLE(S);
-Designates that SAS use a particular variable or set of variables to identify observations in the printout instead of observation numbers. If more than one variable is in the ID statement, each is printed as an identifying column at the left of the output.
C. By Statement
SYNTAX: BY VARIABLE(S);
-The by statement prints the observations in separate groups, one for each value of the by variable(s). The data set must already be sorted by those variables. If it is sorted in descending order of a variable, we add the Descending option before the variable name. If more than one variable is listed, the groups are formed from each combination of values.
D. Sum Statement
SYNTAX: SUM VARIABLE(S);
-The sum statement totals the values of the given variable(s) and prints them in the
output window.
E. Title & Footnote Statement
SYNTAX: TITLEn 'TEXT MESSAGE';
FOOTNOTEn 'TEXT MESSAGE';
-The title and footnote statements work the same way. The number n specifies the placement; the smallest numbers indicate main titles/footnotes. They can also be used in many other procedures with the same effect.
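The proc print statements described so far can be combined in one step. The sketch below uses a made-up data set; it also includes a VAR statement, which restricts the printed columns.

```sas
data sales;
    input region $ rep $ amount;
    label amount = 'Sales Amount';
    datalines;
East Ann 100
East Bob 150
West Cy 200
;
run;

/* BY-group printing requires data sorted by the BY variable. */
proc sort data=sales;
    by region;
run;

title1 'Sales by Region';
footnote1 'Illustrative data only';
proc print data=sales label;
    by region;     /* one section per region */
    id rep;        /* replaces the observation number column */
    var amount;
    sum amount;    /* totals for each region and overall */
run;
title1;            /* clear the title and footnote afterwards */
footnote1;
```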
F. Var Statement
SYNTAX: VAR VARIABLE(S);
-The var statement specifies which variables to print and their order. It is used in many other procedures.
G. Sumby Statement
SYNTAX: SUMBY VARIABLE;
-Used together with the by and sum statements. The output print will include a subtotal each time the value of the by variable listed in the sumby statement changes.
2.2 PROC FREQUENCY STEP
A. Proc Frequency Statement Options
SYNTAX: PROC FREQ OPTION(S);
-The frequency procedure (invoked as PROC FREQ) is effective in analyzing categorical data, as it provides frequency counts and proportions and can be used to perform chi-square tests. The order option can take the values data, formatted, freq, or internal:
DATA: Order in which the values appear in the input data set
FORMATTED: Sorted by the formatted values
FREQ: Sorted by descending frequency count
INTERNAL: Sorted by the unformatted values
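A sketch of a complete frequency step on made-up, pre-counted survey data. It uses the Weight, Tables, and Where statements, which are described in the subsections that follow.

```sas
data survey;
    input answer $ gender $ count;
    datalines;
Yes F 20
Yes M 15
No F 10
No M 25
;
run;

proc freq data=survey order=freq;
    weight count;                          /* each row stands for COUNT respondents */
    tables answer*gender / chisq;          /* two-way table with chi-square tests */
    tables answer / binomial alpha=0.10;   /* one-way table, 90% confidence limits */
    where count gt 0;
run;
```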
B. Weight Statement
SYNTAX: WEIGHT VARIABLE;
-Specifies which numeric variable gives the counts of each observation in the input data set.
C. Tables Statement
SYNTAX: TABLES REQUEST(S) / OPTION(S);
-The tables statement generates tables that can be one-way to n-way tables. Variables are joined with an asterisk to request a crosstabulation (ex. A*B).
-ALPHA= option
Sets the significance level alpha for the 100(1-alpha)% confidence intervals.
-Binomial option
Gets the binomial proportion, confidence limits, and tests if tables are one-way.
-Chisq option
Gets chi-square tests and statistics.
D. Where Statement
SYNTAX: WHERE EXPRESSION-1;
-Produces proc frequency outputs only for observations where the expression(s) return true values. Can be used in many other procedures.
2.3 PROC CONTENTS STEP
A. Proc Contents Statement Options
SYNTAX: PROC CONTENTS OPTION(S);
-The contents procedure produces a detailed description of a given data set, such as a
listing of variables with descriptions like length, type, etc.; number of observations in the
data set; etc.
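A one-step sketch, assuming the sample data set SASHELP.CLASS that normally ships with SAS is available:

```sas
proc contents data=sashelp.class;
run;
```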
2.4 PROC TABULATE STEP
A. Proc Tabulate Statement Options
SYNTAX: PROC TABULATE OPTION(S);
-The tabulate procedure provides statistics that can be produced in other procedures,
but places them in a compact table/set of tables.
B. Class Statement
SYNTAX: CLASS VARIABLE(S);
-The class statement is used in many procedures; it specifies one or more variables whose values define groups.
C. Var Statement
SYNTAX: VAR VARIABLE(S);
-The var statement is used in many procedures; it specifies one or more variables to be analyzed, with the method of analysis depending on the procedure.
D. Table Statement
SYNTAX: TABLE VARIABLE(S);
-The table statement defines the layout of the table. Dimensions (rows and columns) are separated by commas, and each dimension is built from class variables, analysis variables, and statistics (ex. MEAN) joined with an asterisk.
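A sketch of a tabulate step, assuming the sample data set SASHELP.CLASS is available. Rows are the levels of sex; columns are the means of height and weight.

```sas
proc tabulate data=sashelp.class;
    class sex;
    var height weight;
    table sex, (height weight)*mean;
run;
```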
2.5 PROC SORT STEP
A. Proc Sort Statement Options
SYNTAX: PROC SORT OPTION(S);
-The sort procedure sorts a data set by the variable(s) specified in a nested by statement. It is usually used before a data step that merges sorted data sets by a particular variable.
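A sketch of a sort that writes to a new data set instead of overwriting the input, assuming SASHELP.CLASS is available; the output name is arbitrary.

```sas
proc sort data=sashelp.class out=class_by_age;
    by descending age name;   /* oldest first, then alphabetical by name */
run;
```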
2.6 PROC GCHART STEP
A. Proc GChart Statement Options
SYNTAX: PROC GCHART OPTION(S);
-The GChart procedure produces visual summaries of data in the form of charts. We can
produce block charts, horizontal and vertical bar charts, pie and donut charts, and star
charts.
B. HBar, VBar and Vbar3d Statements
SYNTAX: HBAR VARIABLE-1;
VBAR VARIABLE-1;
VBAR3D VARIABLE-1;
-The HBar statement creates a horizontal bar chart for frequencies (default), sums, or means. VBar is similar, only the bar charts are vertical.
-The HBar3D statement creates a 3-d horizontal bar chart for frequencies (default), sums, or means. VBar3d is similar.
C. Block Statement
-The block statement is very similar to the bar statements, only the block statement produces visual summaries in the form of blocks instead of bars.
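A sketch of two bar charts, assuming SAS/GRAPH is licensed and the sample data set SASHELP.CLASS is available. The DISCRETE, SUMVAR=, and TYPE= options are additions not described above.

```sas
proc gchart data=sashelp.class;
    vbar age / discrete;                   /* frequency of each distinct age */
    hbar sex / sumvar=height type=mean;    /* mean height for each sex */
run;
quit;
```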
2.7 PROC GPLOT STEP
A. Proc GPlot Statement Options
SYNTAX: PROC GPLOT;
-The GPLOT procedure produces visual summaries for data, this time on a set of axes.
B. Plot Statement
SYNTAX: PLOT Y-VARIABLE*X-VARIABLE;
-We can plot a y-variable against an x-variable very easily with the plot statement.
C. Symbol Statement
SYNTAX: SYMBOL;
-The symbol statement helps us edit the gplot output.
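A sketch of a scatter plot, assuming SAS/GRAPH is licensed and SASHELP.CLASS is available. The symbol settings shown are arbitrary choices.

```sas
symbol1 value=dot color=blue interpol=none;   /* blue dots, points not joined */
proc gplot data=sashelp.class;
    plot weight*height;   /* weight on the vertical axis, height on the horizontal */
run;
quit;
```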
2.8 PROC FORMAT STEP
A. Proc Format Statement Options
SYNTAX: PROC FORMAT;
-The format procedure helps change the appearance of output.
B. Value Statement
SYNTAX: VALUE NAME;
-The value statement works to replace the original values with a format we specify. We can say a set of values should take a specific format, whether it be a category, or even a renaming.
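A sketch of a user-defined format that groups ages into categories, assuming SASHELP.CLASS is available. The format name agegrp and its cutoffs are made up for the example.

```sas
proc format;
    value agegrp
        low - 12  = 'Child'
        13 - 19   = 'Teen'
        20 - high = 'Adult';
run;

/* The stored ages are unchanged; only their display is replaced. */
proc print data=sashelp.class;
    format age agegrp.;
    var name age;
run;
```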
C. Picture Statement
SYNTAX: PICTURE NAME;
The picture and value statements work similarly. Only the picture statement has the option of
retaining the original value of a variable in addition to adding a character or formatting. For
example, we can say 0.88 -
3 STATISTICAL ANALYSIS IN SAS
3.1 PROC UNIVARIATE STEP
A. Proc Univariate Statement Options
SYNTAX: PROC UNIVARIATE OPTION(S);
The univariate procedure is effective in producing univariate statistical analysis on one or more variables. Options include
Alpha=
This option specifies a significance level for the provided 100(1-alpha)% intervals.
CIBASIC
This option requests confidence intervals for the mean, standard deviation and variance of specified variable(s) with the assumption they are normally distributed.
Mu0=
This option changes the hypothesized value of the mean from the default of 0 to a specified value.
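A sketch of a univariate step using these options, assuming SASHELP.CLASS is available. The option values are arbitrary, and the Var and Histogram statements it uses are described in the subsections that follow.

```sas
proc univariate data=sashelp.class alpha=0.10 cibasic mu0=60;
    var height;
    histogram height / normal;   /* overlay a fitted normal density curve */
run;
```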
B. Var Statement
SYNTAX: VAR VARIABLE(S);
This statement specifies a variable(s) for univariate analysis.
C. Histogram Statement
SYNTAX: HISTOGRAM VARIABLE(S) / OPTION(S);
The histogram statement produces a frequency bar chart for a specified variable(s). In the options field, we can specify a continuous distribution (ex. Normal, Exponential, etc.) and the procedure will superimpose its estimate of the appropriate probability density curve, and it will also provide goodness of fit tests.
3.2 PROC MEANS STEP
A. Proc Means Statement Options
SYNTAX: PROC MEANS OPTION(S) DESIRED-STATISTIC(S);
The means procedure is a more compact version of the univariate procedure. The options field is similar to that of the univariate procedure, but we can limit which statistics are displayed by listing them in the desired-statistics field (ex. N=# of observations, MEAN, SUM, etc.).
B. Var Statement
SYNTAX: VAR VARIABLE;
The Var statement works the same way here as it does in the univariate procedure.
3.3 PROC TTEST STEP
A. Proc ttest Statement Options
SYNTAX: PROC TTEST OPTION(S);
The ttest procedure produces t-tests for single samples, paired observation sets, and two independent samples. The options are similar to those of the Means and Univariate procedures.
B. Class Statement
SYNTAX: CLASS VARIABLE;
Just like in the tabulate procedure, the class statement specifies a group variable for the ttest procedure. This is required if we do analysis on two independent samples.
C. Var Statement
SYNTAX: VAR VARIABLE;
Again, the var statement works just as it does in many other procedures.
D. Paired Statement
SYNTAX: PAIRED VARIABLE-A*VARIABLE-B;
If we desire to perform analysis on a paired sample, we use the paired statement.
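The means and ttest procedures can be sketched as follows, assuming SASHELP.CLASS is available. The paired data set is made up for the example.

```sas
/* Only the listed statistics are displayed. */
proc means data=sashelp.class n mean std sum maxdec=2;
    var height weight;
run;

/* Two independent samples: compare mean height between the two sexes. */
proc ttest data=sashelp.class alpha=0.05;
    class sex;
    var height;
run;

/* Paired sample: each observation holds both measurements. */
data before_after;
    input before after;
    datalines;
120 115
130 128
125 119
140 135
;
run;

proc ttest data=before_after;
    paired before*after;
run;
```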
The GLM procedure is similar to the regression procedure, only it uses the method of least squares to fit general linear models.
B. LSMeans Statement
SYNTAX: LSMEANS VARIABLE;
The LSMEANS statement calculates least squares means for each listed variable. It performs analysis on them as well.
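A sketch of a GLM step with least squares means, assuming SASHELP.CLASS is available. The model and the PDIFF option (p-values for differences between the means) are illustrative choices.

```sas
proc glm data=sashelp.class;
    class sex;
    model weight = sex height;
    lsmeans sex / pdiff;   /* mean weight for each sex, adjusted for height */
run;
quit;
```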
3.7 PROC LOGISTIC STEP
A. Proc Logistic Statement
SYNTAX: PROC LOGISTIC;
The logistic procedure is useful in creating logistic models (a model to predict probabilities given explanatory variables) and producing analysis for them.
B. Class Statement
SYNTAX: CLASS VARIABLE(S);
The class statement works in the logistic procedure similarly to how it does in previously mentioned procedures.
C. Model Statement
SYNTAX: MODEL DEPENDENT-BINARY-VARIABLE=EFFECT(S);
The model statement works similarly to the model statement in the regression procedure, only the dependent variable need be binary in this case.
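A sketch of a logistic model with a binary response, assuming SASHELP.CLASS is available. Treating sex as the response is purely for illustration, and EVENT='F' names the level whose probability is modeled.

```sas
proc logistic data=sashelp.class;
    model sex(event='F') = height weight;
run;
```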