The post Now is the time to get SAS certified appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Yes, I read the baby books. I visited the websites. I swaddled a football. I was prepared. When my daughter came into the world last year there was a small comfort that I had done about all I could do to prepare for her arrival. I say a small comfort […]
The post 5 new books to go back to school with SAS appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Whether you’re a high school student, a college student, a working professional looking to step up their SAS® game, or a lifelong learner who wants to explore analytics, statistics, and learn new skills, SAS Press has something for everyone this back-to-school season. A Recipe for Success Using SAS(R) University Edition: […]
The post Create an animation with the BY statement in PROC SGPLOT appeared first on The DO Loop.
This post was kindly contributed by The DO Loop - go there to comment and to read the full post.
It is easy to use PROC SGPLOT and BY-group processing to create an animated graph in SAS 9.4. Sanjay Matange previously discussed how to create an animated plot in SAS 9.4, but he used a macro loop to call PROC SGPLOT many times. It is often easier to use the BY statement in SAS procedures to create many graphs.
Someone recently asked me how I created an animation that shows level sets of a contour plot. This article explains how to create an animation by using the BY statement in PROC SGPLOT.
An animation requires that you create a sequence of images. In SAS 9.4, you can create an animated GIF by using the ODS PRINTER destination. ODS does not care how the images are generated. They can be created by a macro loop. Or, as shown below, they can be generated by using the BY statement in PROC SGPLOT, SGRENDER, or any other procedure in SAS.
As an example, I will create the graph at the top of this article, which shows the annual time series for the stock price of three US companies for 20 consecutive years.
The data are contained in the Sashelp.Stocks data set. The following DATA step adds two new variables: Year and Month. The data are then sorted according to Date, which also sorts the data by Year.
data stocks;
   set sashelp.stocks;
   Month = month(date);       /* 1, 2, 3, ..., 12 */
   Year = year(date);         /* 1986, 1987, ..., 2005 */
run;

proc sort data=stocks; by date; run;
I will create an animation that contains 20 frames. Each frame will be a graph that shows the stock performance for the three companies in a particular year. You can use PROC MEANS to discover that the stock prices are within the range [10, 210], so that range is used for the vertical axis:
ods graphics / imagefmt=GIF width=4in height=3in;     /* each image is 4in x 3in GIF */
options papersize=('4 in', '3 in')                    /* set size for images */
        nodate nonumber                               /* do not show date, time, or frame number */
        animduration=0.5 animloop=yes noanimoverlay   /* animation details */
        printerpath=gif animation=start;              /* start recording images to GIF */
ods printer file='C:\AnimGif\ByGroup\Anim.gif';       /* images saved into animated GIF */

ods html select none;                                 /* suppress screen output */
proc sgplot data=stocks;
   title "Stock Performance";
   by year;                                           /* create 20 images, one for each year */
   series x=month y=close / group=stock;              /* each image is a time series */
   xaxis integer values=(1 to 12);
   yaxis min=10 max=210 grid;                         /* set common vertical scale for all graphs */
run;
ods html select all;                                  /* restore screen output */

options printerpath=gif animation=stop;               /* stop recording images */
ods printer close;                                    /* close the animated GIF file */
The BY statement writes a series of images. They are placed into the animated GIF file that you specify on the FILE= option in the ODS PRINTER statement.
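The common vertical-axis range can be checked before plotting. Here is a minimal sketch that assumes only the Sashelp.Stocks data set that ships with SAS; the MAXDEC= setting is an arbitrary choice for readability:

```sas
/* Report the smallest and largest closing price across all three stocks */
proc means data=sashelp.stocks min max maxdec=2;
   var close;
run;
```

If the reported minimum and maximum fall inside the range that you pass to the YAXIS statement, no series is clipped in any frame of the animation.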
A few tricks are worth mentioning:
You can use a browser to view the image.
As I did in this blog post, you can embed the image in a web page.
Have fun creating your animations! Leave a comment and tell me about your animated creations.
The post Free POC, on the latest SAS technology, in your own office appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Ever wanted to test out the latest SAS® technology on your own data, but lacked the time or administrative backing of IT to make that happen? Ever wanted to test drive Hadoop with your own data and SAS® but are lacking the support or skills to do that? If I piqued […]
The post If we didn't start the fire, then who did? appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
If you’re into 1980s pop music, then I bet you love Billy Joel’s song We Didn’t Start the Fire. But do you know every word, and the significance of every reference? Let’s use SAS software to create an interactive visualization that will help you fully understand this song! I first saw […]
The post The perfect classroom companion for teaching SAS appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
It’s back to school time again. If you’re working to get your lesson plans in place, we have something that might help you and your students – Exercises and Projects for The Little SAS Book. This book is perfect for instructors and students in a classroom setting, especially when used […]
The post What were your #FirstSevenLanguages? appeared first on The SAS Dummy.
This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.
My computer geek colleagues are boasting about their humble beginnings by sharing lists of their first seven programming languages. You can find these under the hashtag #FirstSevenLanguages.
COBOL
PL/1
SAS
IFPS
APL
370 Assembler
C
SQL
Lisp
#FirstSevenLanguages (Paul Kent, @hornpolish, August 16, 2016)
From what I’ve seen of these lists, the programming languages that appear are very much a function of age — not the age of the language, but of the person sharing the list. It’s also a function of industry. For people of a certain age who first worked at a bank, COBOL appears early on the list. Did you work in the defense industry? Ada is probably on your list.
Of course, the SAS programming language features prominently among my colleagues. I have argued that listing SAS is a bit of a cheat, since SAS actually comprises several different programming languages: DATA step, SQL, DS2, SAS macro, IML, GTL, SCL, and more. SAS also contains hooks into other languages like Lua and Groovy. Some SAS analytical procedures are programming languages in their own right, like PROC OPTMODEL.
I have several friends who have built their entire careers on SAS programming. There is little risk of boredom, as the SAS language evolves with each release and is used in virtually every industry. It’s like a huge mansion of a programming language — we all have our favorite rooms where we spend most of our time, but there are always new additions to discover and explore.
I’ve said that I don’t identify myself as a programmer, even though programming is an activity that occupies lots of my time. Here’s my #FirstSevenLanguages list. It’s not exactly in chronological order, and like other folks I’m cheating by grouping some languages together into eras.
Unlike some of my more distinguished colleagues, there are no “punch cards” languages on my list. Nostalgia is sometimes fun, but I don’t believe anyone who says that the era of punch cards, 16K RAM, and 8-inch floppy disks was “the good old days.” Instead, I prefer to look forward to my #NextSevenLanguages. In my current role with SAS Support Communities, I get to dabble in JavaScript, FreeMarker, and Python. But I use SAS every day and for so many tasks, it remains high on my list of languages to learn!
The post The smooth bootstrap method in SAS appeared first on The DO Loop.
This post was kindly contributed by The DO Loop - go there to comment and to read the full post.
Last week I showed how to use the simple bootstrap to randomly resample from the data to create B bootstrap samples, each containing N observations. The simple bootstrap is equivalent to sampling from the empirical cumulative distribution function (ECDF) of the data. An alternative bootstrap technique is called the smooth bootstrap. In the smooth bootstrap you add a small amount of random noise to each observation that is selected during the resampling process. This is equivalent to sampling from a kernel density estimate, rather than from the empirical density.
The example in this article is adapted from Chapter 15 of Wicklin (2013), Simulating Data with SAS.
My previous article used the bootstrap method to investigate the sampling distribution of the skewness statistic for the SepalWidth variable in the Sashelp.Iris data. I used PROC SURVEYSELECT to resample the data and used PROC MEANS to analyze properties of the bootstrap distribution. You can also use SAS/IML to implement the bootstrap method.
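For readers who have not seen the procedure-based approach, here is a minimal sketch of it. It assumes the same Virginica subset that the SAS/IML program in this article uses; the data set names BootSamples, BootDist, and BootCI, the seed, and the choice of 5000 resamples are illustrative assumptions.

```sas
/* Assumption: analyze SepalWidth for the Virginica species, renamed to x */
data sample(keep=x);
   set Sashelp.Iris(where=(Species="Virginica") rename=(SepalWidth=x));
run;

/* 1. Resample with replacement: 5000 resamples, each the size of the data */
proc surveyselect data=sample noprint seed=12345
     out=BootSamples
     method=urs            /* unrestricted random sampling (with replacement) */
     samprate=1            /* each resample contains N observations */
     outhits               /* write one row per selected observation */
     reps=5000;
run;

/* 2. Compute the statistic (here, skewness) for each resample */
proc means data=BootSamples noprint;
   by Replicate;
   var x;
   output out=BootDist skew=Skewness;
run;

/* 3. Summarize the bootstrap distribution with percentile limits */
proc univariate data=BootDist noprint;
   var Skewness;
   output out=BootCI pctlpts=2.5 97.5 pctlpre=P;
run;

proc print data=BootCI noobs; run;
```

The three steps (resample, compute, summarize) are the same ones that the SAS/IML version performs, only spread across procedures.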
The following SAS/IML statements create 5000 bootstrap samples of the SepalWidth data. However, instead of computing a bootstrap distribution for the skewness statistic, this program computes a bootstrap distribution for the median statistic. The SAMPLE function enables you to resample from the data.
data sample(keep=x);
   set Sashelp.Iris(where=(Species="Virginica") rename=(SepalWidth=x));
run;

/* Basic bootstrap confidence interval for median */
%let NumSamples = 5000;                 /* number of bootstrap resamples */
proc iml;
use Sample;  read all var {x};  close Sample;   /* read data */
call randseed(12345);                   /* set random number seed */
obsStat = median(x);                    /* compute statistic on original data */
s = sample(x, &NumSamples // nrow(x));  /* bootstrap samples: 50 x NumSamples */
D = T( median(s) );                     /* bootstrap distribution for statistic */
call qntl(q, D, {0.025 0.975});         /* basic 95% bootstrap CI */
results = obsStat || q`;
print results[L="Bootstrap Median" c={"obsStat" "P025" "P975"}];
The SAS/IML program is very compact. The MEDIAN function computes the median for the original data. The SAMPLE function generates 5000 resamples; each bootstrap sample is a column of the s matrix.
The MEDIAN function then computes the median of each column. The QNTL subroutine computes a 95% confidence interval for the median as [28, 30]. (Incidentally, you can use PROC UNIVARIATE to compute distribution-free confidence intervals for standard percentiles such as the median.)
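The PROC UNIVARIATE alternative mentioned in passing can be sketched as follows. This is only an illustration: it rebuilds the same Virginica subset so that it runs on its own, and it uses the CIPCTLDF option, which requests distribution-free confidence limits for the percentiles at the default 95% level.

```sas
/* Assumption: same subset and variable name as the SAS/IML program */
data sample(keep=x);
   set Sashelp.Iris(where=(Species="Virginica") rename=(SepalWidth=x));
run;

ods select Quantiles;                 /* display only the quantile table */
proc univariate data=sample cipctldf; /* distribution-free CIs for percentiles */
   var x;
run;
```

The row for the 50th percentile in the resulting table gives an interval for the median that does not rely on resampling.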
The following statements create a histogram of the bootstrap distribution of the median:
title "Bootstrap Distribution for Median";
call histogram(D) label="Median";       /* create histogram in SAS/IML */
I was surprised when I first saw a bootstrap distribution like this. The distribution contains discrete values. More than 80% of the bootstrap samples have a median value of 30. The remaining samples have values that are integers or half-integers.
This distribution is typical of the bootstrap distribution for a percentile. Three factors contribute to the shape:
Although the bootstrap distribution for the median is correct, it is somewhat unsatisfying. Widths and lengths represent continuous quantities. Consequently, the true sampling distribution of the median statistic is continuous.
The bootstrap distribution would look more continuous if the data had been measured with more precision. Although you cannot change the data, you can change the way that you create bootstrap samples. Instead of drawing resamples from the (discrete) ECDF, you can randomly draw samples from a kernel density estimate (KDE) of the data. The resulting samples will not contain data values. Instead, they will contain values that are randomly drawn from a continuous KDE.
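To make the idea concrete, here is a minimal DATA step sketch that draws a single smoothed resample. The uniform kernel, the bandwidth h=0.5, the seed, and the names OneSmoothResample and xSmooth are assumptions chosen for illustration; a real analysis would repeat the draw many times and compute the statistic on each resample.

```sas
/* Assumption: same subset and variable name as the SAS/IML program */
data sample(keep=x);
   set Sashelp.Iris(where=(Species="Virginica") rename=(SepalWidth=x));
run;

data OneSmoothResample(keep=x xSmooth);
   call streaminit(12345);                      /* set random number seed */
   h = 0.5;                                     /* assumed bandwidth */
   do i = 1 to nObs;                            /* draw N observations */
      k = ceil(nObs * rand("Uniform"));         /* random row index in 1..nObs */
      set sample point=k nobs=nObs;             /* read that observation */
      xSmooth = x + h*(2*rand("Uniform") - 1);  /* add U(-h, h) noise */
      output;
   end;
   stop;                                        /* required with POINT= access */
run;
```

The x column holds the ordinary (rounded) resample, whereas xSmooth holds the same draws after jittering, so the two can be compared side by side.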
You have to make two choices for the KDE: the shape of the kernel and the bandwidth. This article explores two possible choices:
For more about the smooth bootstrap, see Davison and Hinkley (1997) Bootstrap Methods and their Application.
For the iris data, the uniform kernel seems intuitively appealing. The following SAS/IML program defines a function named SmoothUniform that randomly chooses B samples and adds a random U(-h, h) variate to each data point. The medians of the columns form the bootstrap distribution.
/* randomly draw a point from x. Add noise from U(-h, h) */
start SmoothUniform(x, B, h);
   N = nrow(x) * ncol(x);
   s = Sample(x, N // B);               /* B x N matrix */
   eps = j(B, N);                       /* allocate vector */
   call randgen(eps, "Uniform", -h, h); /* fill vector */
   return( s + eps );                   /* add random uniform noise */
finish;

s = SmoothUniform(x, &NumSamples, 0.5); /* columns are bootstrap samples from KDE */
D = T( median(s) );                     /* median of each col is bootstrap distrib */
BSEst = mean(D);                        /* bootstrap estimate of median */
call qntl(q, D, {0.025 0.975});         /* basic 95% bootstrap CI */
results = BSEst || q`;
print results[L="Smooth Bootstrap (Uniform Kernel)" c={"Est" "P025" "P975"}];
The smooth bootstrap distribution (not shown) is continuous.
The mean of the distribution is the bootstrap estimate for the median. The estimate for this run is 29.8. The central 95% of the smooth bootstrap distribution is [29.77, 29.87]. The bootstrap estimate is close to the observed median, but the CI is much smaller than the earlier simple CI. Notice that the observed median (which is computed on the rounded data) is not in the 95% CI from the smooth bootstrap distribution.
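Because the MEDIAN function summarizes each column of a matrix, the orientation of the matrix of resamples determines what each median is a median of. Here is a small sketch for checking the orientation on toy data before interpreting an interval. The data values, the seed, and B=3 are made up for illustration and are not from the iris example.

```sas
proc iml;
call randseed(1);
x = {28, 30, 31, 29, 30};      /* toy column vector with N = 5 values */
N = nrow(x);
B = 3;                         /* hypothetical number of resamples */
s = sample(x, N // B);         /* two-element size argument, as in SmoothUniform */
print (nrow(s))[L="rows of s"] (ncol(s))[L="columns of s"];
m = median(s);                 /* one median for each column of s */
print (ncol(m))[L="number of medians"];
quit;
```

If the number of medians equals the number of resamples, each median summarizes one resample of size N. If it equals N instead, transposing the matrix before calling MEDIAN gives one median per resample.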
Many researchers in density estimation state that the shape of the kernel function does not have a strong impact on the density estimate (Scott (1992), Multivariate Density Estimation, p. 141). Nevertheless, the following SAS/IML statements define a function called SmoothNormal that implements a smoothed bootstrap with a normal kernel:
/* Smooth bootstrap with normal kernel and sigma = h */
start SmoothNormal(x, B, h);
   N = nrow(x) * ncol(x);
   s = Sample(x, N // B);               /* B x N matrix */
   eps = j(B, N);                       /* allocate vector */
   call randgen(eps, "Normal", 0, h);   /* fill vector */
   return( s + eps );                   /* add random normal variate */
finish;

s = SmoothNormal(x, &NumSamples, 0.25); /* bootstrap samples from KDE */
The mean of this smooth bootstrap distribution is 29.89. The central 95% interval is [29.86, 29.91]. As expected, these values are similar to the values obtained by using the uniform kernel.
In summary, the SAS/IML language provides a compact and efficient way to implement the bootstrap method for a univariate statistic such as the skewness or median. A visualization of the bootstrap distribution of the median reveals that the distribution is discrete due to the rounded data values and the statistical properties of percentiles. If you choose, you can “undo” the rounding by implementing the smooth bootstrap method. The smooth bootstrap is equivalent to drawing bootstrap samples from a kernel density estimate of the data. The resulting bootstrap distribution is continuous and gives a smaller confidence interval for the median of the population.
The post Female voters outnumber males in North Carolina appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Since this is an election year, I’ve been scrutinizing the voter registration data. One thing that surprised me is there are more female voters registered in NC than males. I wondered if this was consistent across all 100 counties, and created some charts to help visualize the data… First I went […]