The post How to scare up a few good graphs for Halloween appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Halloween appears to be my favorite holiday, because I seem to have more graphs related to it than any of the others. And since Halloween is just a few days away, I thought you might like an easy way to see all those graphs. Here are links to my previous Halloween-related blog posts, containing […]
The post Create patterns of missing data appeared first on The DO Loop.
This post was kindly contributed by The DO Loop - go there to comment and to read the full post.
When simulating data or testing algorithms, it is useful to be able to generate patterns of missing data. This article shows how to generate random and systematic patterns of missing values. In other words, this article shows how to replace nonmissing data with missing data.
The following SAS/IML program reads numerical data into a matrix from the Sashelp.Class data set.
The matrix has 19 rows and three columns. The program then generates a matrix of the same size that contains a random pattern of zeros and ones, where about 40% of the values will be ones. The LOC function is used to find the locations of the ones, and the corresponding locations in the data are set to missing:
proc iml;
use Sashelp.Class;                         /* read numeric data into X */
read all var _NUM_ into X;
close;

/* random assignment of missing values */
RandX = X;                                 /* copy data */
p = 0.4;                                   /* approx proportion of missing elements */
call randseed(1234);
B = randfun(dimension(X), "Bernoulli", p); /* random 0s or 1s */
missIdx = loc(B=1);                        /* find position of 1s */
if ncol(missIdx)>0 then
   RandX[missIdx] = .;                     /* replace 1s with missing */
print RandX;
In this way, you can replace a certain percentage of the data values with missing values.
In the preceding section, the technique for inserting missing values does not use the fact that the matrix B is random. The technique works with any zero-one matrix B that specifies a pattern of missing values. For example, you can create a matrix that contains all combinations of zeros and ones, then use that pattern to set missing values, as follows:
C = { 0 0 0,
      0 0 1,
      0 1 0,
      0 1 1,
      1 0 0,
      1 0 1,
      1 1 0,
      1 1 1 };                 /* pattern matrix */
missIdx = loc(C=1);
SysX = X;                      /* copy data */
if ncol(missIdx)>0 then
   SysX[missIdx] = .;          /* replace 1s with missing */
print SysX;
You could also specify the locations of the missing values by using subscripts of the data matrix. You can use the SUB2NDX function to convert subscripts to indices.
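As a minimal sketch of the subscript approach, the following statements set three cells of the data matrix to missing. The (row, column) pairs in subs are hypothetical values chosen only for illustration, and SubX is an assumed name that does not appear in the original program:

```sas
proc iml;
use Sashelp.Class;                       /* read numeric data into X */
read all var _NUM_ into X;
close;

/* hypothetical (row, column) subscripts of the cells to set to missing */
subs = {1 1,
        2 3,
        5 2};
missIdx = sub2ndx(dimension(X), subs);   /* convert subscripts to indices */
SubX = X;                                /* copy data */
SubX[missIdx] = .;                       /* set those cells to missing */
print SubX;
quit;
```

Because SAS/IML stores matrices in row-major order, each subscript pair maps to a single index, which is why the same X[missIdx] = . assignment works here as it does for the zero-one pattern matrices.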
In the SAS DATA step you can use arrays to create a random pattern of missing values. For example, the following SAS data set reads numerical variables from the Sashelp.Class data and randomly assigns 40% of the data to missing values:
/* generate missing values in random locations */
data RandClass(drop=i);
call streaminit(1234);
set Sashelp.Class(keep=_NUMERIC_);
array x {*} _numeric_;
do i = 1 to dim(x);
   if rand("Bern", 0.4) then   /* p=0.4 ==> about 40% missing */
      x[i]=.;
end;
run;

proc print; run;
The output is not shown, but the random pattern is identical to the random pattern that was generated by using SAS/IML matrices.
You could use the DATA step to specify patterns of missing values for which there is a formula, such as every fourth data value (MOD(cnt,4)=1). However, it is harder to generate an arbitrary pattern, such as the “all combinations” pattern in the previous section. In general, I think the SAS/IML approach is easier to use and more flexible.
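A formula-based pattern might look like the following sketch. The data set name SysClass and the counter variable cnt are assumptions for illustration. The counter runs across every data value (row by row), and each value whose counter satisfies MOD(cnt,4)=1 is set to missing:

```sas
/* generate missing values at every fourth data value */
data SysClass(drop=i cnt);
set Sashelp.Class(keep=_NUMERIC_);
array x {*} _numeric_;          /* only the variables read by the SET statement */
do i = 1 to dim(x);
   cnt + 1;                     /* running count of data values (retained) */
   if mod(cnt, 4) = 1 then
      x[i] = .;
end;
run;

proc print data=SysClass; run;
```

The ARRAY statement appears before i and cnt are created, so the array contains only the numeric variables from Sashelp.Class.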
For any pattern of missing values, you can use PROC MI to summarize the pattern.
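As a hedged sketch of that summary, the following statements recreate the random pattern from the DATA step example and then ask PROC MI for only the table of missing-data patterns. The NIMPUTE=0 option requests no imputations, and the ODS SELECT statement restricts the output to the MissPattern table:

```sas
/* recreate the random pattern of missing values */
data RandClass(drop=i);
call streaminit(1234);
set Sashelp.Class(keep=_NUMERIC_);
array x {*} _numeric_;
do i = 1 to dim(x);
   if rand("Bern", 0.4) then x[i] = .;
end;
run;

/* summarize the pattern; no imputation is performed */
proc mi data=RandClass nimpute=0;
   var Age Height Weight;
   ods select MissPattern;
run;
```

Each row of the resulting table describes one combination of missing and nonmissing variables, along with the number of observations that have that combination.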
You can also use various graphical techniques to visualize the pattern of missing data.
The post What comes shipped free with Base SAS 9.4? appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Like unexpectedly seeing this beautiful bird in nature, SAS has tons of free goodies you might be surprised to encounter as you explore your software. While individual organizations may have separate permutations and combinations of SAS software, I’d like to share the list of freebies that comes with a simple […]
The post The perfect storm for State Fair attendance! appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
The State Fair in North Carolina is just a few miles from SAS headquarters, and therefore it’s virtually impossible for it to slip by without me noticing it. There are two aspects of the fair that usually get lots of news coverage – what’s the latest fair-food, and did we […]
The post New features in SAS Enterprise Guide 7.1 appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
SAS Enterprise Guide has come a long way since version 1.0 was released in 1999! Are any of you original users that remember the Help characters, Clippy, Peedy or Merlin? I was working as a statistician for another company that year, and I attended a SAS user group meeting where […]
The post How much does it cost to live to 100? appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
Living to 100 isn’t as simple as just paying a certain amount of money for your healthcare. But that is an interesting aspect of longevity, so let’s have a look at the data … In my previous blog post, we analyzed how much people from various countries spend on healthcare. […]
The post Ahh, that's smooth! Anti-aliasing in SAS statistical graphics appeared first on The DO Loop.
This post was kindly contributed by The DO Loop - go there to comment and to read the full post.
I’ve written several articles about scatter plot smoothers: nonparametric regression curves that reveal small- and large-scale features of a response variable as a function of an explanatory variable.
However, there is another kind of “smoothness” that you might care about, and that is the apparent smoothness of curves and markers that are rendered on your computer screen or other device.
Recall that anti-aliasing is a graphical technique in which some pixels along a rendered curve are set to an intermediate color, which makes a curve look smoother. For example, if a curve is being drawn by using a black pen, some of the neighboring pixels along the rendered curve are set to shades of grey, which tricks the eye into seeing a smooth curve instead of a jagged, pixellated curve.
The ANTIALIAS= and ANTIALIASMAX= options were added to the ODS GRAPHICS statement in SAS 9.2.
A typical usage follows:
ods graphics / antialias=on antialiasmax=5000;
The ANTIALIAS= option specifies whether to anti-alias. By default, anti-aliasing is on.
Because it can be expensive to anti-alias many thousands of graphical elements, the ANTIALIASMAX= option enables you to specify the maximum number of elements (markers or curve points) that can be in a plot before anti-aliasing is disabled for that plot.
The default value is ANTIALIASMAX=4000 for SAS 9.4m3. However, the default is only 600 for earlier releases, so you might want to bump up that value when you need a presentation-quality graphic that has thousands of graphical elements. If SAS disables anti-aliasing for a plot because the plot contains too many elements, the SAS log will contain a note similar to the following:
NOTE: Marker and line anti-aliasing has been disabled because the threshold has been reached. You can set ANTIALIASMAX=1000 in the ODS GRAPHICS statement to restore anti-aliasing.
A related option is turning on subpixel rendering by using the SUBPIXEL option.
The SUBPIXEL option was added to the ODS GRAPHICS statement in SAS 9.4m3, but it has been available on the PROC SGPLOT statement for several 9.4 releases.
The SAS documentation for the ODS GRAPHICS statement says that the SUBPIXEL option “produces smoother curves and more precise bar spacing.” There is a section in the documentation titled “Subpixel Rendering,” which demonstrates the impact that subpixel rendering can have on curves and bar charts.
The documentation says that subpixel rendering “is enabled by default for image output, unless the graph contains a scatter plot or a scatter-plot matrix. In those cases, subpixel rendering is disabled by default.”
For me, subpixel rendering solves a problem that I’ve experienced when I create a large bar chart with many categories. The number of bars, the width of the bars, and the dimensions of the graph determine whether the number of pixels between bars is uniform or whether some gaps are larger than others. Sometimes you will see small uneven gaps between the bars, as shown on the left side of the following plot. However, subpixel rendering improves the plot tremendously, as shown on the right side:
In SAS 9.4m3 and beyond, the SUBPIXEL option applies to all plot types. Prior to SAS 9.4m3, the option applied only to line charts and bar charts; see the documentation of the PROC SGPLOT statement for the specific plots that were supported.
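To experiment with the bar-chart case, you could run something like the following sketch, which assumes SAS 9.4m3 or later so that the SUBPIXEL= option is available on the ODS GRAPHICS statement. The choice of Sashelp.Cars and the Make variable is only an illustration of a chart that has many categories:

```sas
ods graphics / reset subpixel=off;     /* subpixel rendering off */
title "Bar chart without subpixel rendering";
proc sgplot data=Sashelp.Cars;
   vbar Make;                          /* one bar per manufacturer */
run;

ods graphics / subpixel=on;            /* subpixel rendering on (SAS 9.4m3) */
title "Bar chart with subpixel rendering";
proc sgplot data=Sashelp.Cars;
   vbar Make;
run;
title;
```

Whether you see uneven gaps in the first chart depends on the dimensions of the graph and the output device, so you might need to resize the graph to reproduce the effect.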
I think the best way to learn about anti-aliasing and subpixel rendering is to try it out yourself! These ODS options apply to all ODS statistical graphics, including those that are created by SAS analytical procedures. Remember, however, that the SUBPIXEL option was available only in the PROC SGPLOT statement for 9.4 releases prior to m3.
The following SAS statements enable you to play with the options and see the differences for a simple loess curve overlaid on a scatter plot:
ods graphics / reset ANTIALIAS=off;                          /* anti-aliasing off */
proc sgplot data=Sashelp.ENSO;
   loess y=Pressure x=Month / smooth=0.3 degree=2;
run;

ods graphics / ANTIALIAS=on ANTIALIASMAX=10000 SUBPIXEL=off; /* anti-aliasing on */
proc sgplot data=Sashelp.ENSO;
   loess y=Pressure x=Month / smooth=0.3 degree=2;
run;

ods graphics / ANTIALIAS=on ANTIALIASMAX=10000 SUBPIXEL=on;  /* SAS 9.4m3 */
proc sgplot data=Sashelp.ENSO;
   loess y=Pressure x=Month / smooth=0.3 degree=2;
run;
The following resources provide further information about anti-aliasing and subpixel rendering in ODS graphics:
The post Which SAS course should I choose? appeared first on SAS Learning Post.
This post was kindly contributed by SAS Learning Post - go there to comment and to read the full post.
As an instructor for SAS, I receive a wide variety of queries before, during and after delivering my courses. Most frequently, I am asked questions such as: Should I learn SAS programming or a point and click tool instead? I know lots of code, should I go straight to the […]
The post Loess regression in SAS/IML appeared first on The DO Loop.
This post was kindly contributed by The DO Loop - go there to comment and to read the full post.
A previous post discusses how the loess regression algorithm is implemented in SAS.
The LOESS procedure in SAS/STAT software provides the data analyst with options to control the loess algorithm and fit nonparametric smoothing curves through points in a scatter plot.
Although PROC LOESS satisfies 99.99% of SAS users who want to fit a loess model,
some research statisticians might want to extend or modify the standard loess algorithm.
Researchers like to ask “what if” questions like “what if I used a different weighting function?” or “what if I change the points at which the loess model is evaluated?”
Although the loess algorithm is complicated, it is not hard to implement a basic version in a matrix language like SAS/IML.
Recent blog posts have provided some computational modules that you can use to implement loess regression. For example, the PairwiseNearestNbr module finds the k nearest neighbors to a set of evaluation points.
The functions for weighted polynomial regression compute the loess fit at a particular point.
You can download a SAS/IML program that defines the nearest-neighbor and weighted-regression modules.
The following call to PROC IML loads the modules and defines a function that fits a loess curve at points in a vector, t. Each fit uses a local neighborhood that contains k data values. The local weighted regression is a degree-d polynomial:
proc iml;
load module=(PairwiseNearestNbr PolyRegEst PolyRegScore);

/* 1-D loess algorithm. Does not handle missing values.
   Input: t: points at which to fit loess (column vector)
          x, y: nonmissing data (column vectors)
          k: number of nearest neighbors used for loess fit
          d: degree of local regression model
   Output: column vector L[i] = f(t[i]; k, d) where f is loess model */
start LoessFit(t, x, y, k, d=1);
   m = nrow(t);
   Fit = j(m, 1);                       /* Fit[i] is predicted value at t[i] */
   do i = 1 to m;
      x0 = t[i];
      run PairwiseNearestNbr(idx, dist, x0, x, k);
      XNbrs = X[idx];  YNbrs = Y[idx];  /* X and Y values of k nearest nbrs */
      /* local weight function where dist[,k] is max dist in neighborhood */
      w = 32/5 * (1 - (dist / dist[k])##3 )##3; /* use tricubic weight function */
      b = PolyRegEst(YNbrs, XNbrs, w`, d);      /* param est for local weighted regression */
      Fit[i] = PolyRegScore(x0, b);             /* evaluate polynomial at x0 */
   end;
   return Fit;
finish;
This algorithm provides some features that are not in PROC LOESS.
You can use this function to evaluate a loess fit at arbitrary X values, whereas PROC LOESS evaluates the function only at quantiles of the data.
You can use this function to fit a local polynomial regression of any degree (for example, a zero-degree polynomial), whereas PROC LOESS fits only first- and second-degree polynomials.
Although I hard-coded the standard tricubic weight function, you could replace the function with any other weight function.
On the other hand, PROC LOESS supports many features that are not in this proof-of-concept function, such as automatic selection of the smoothing parameter, handling missing values, and support for higher-dimensional loess fits.
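To make the remark about swapping weight functions concrete, here is a minimal sketch that defines two candidate weight functions as SAS/IML modules and evaluates them on a grid. The module names TricubeWt and BisquareWt are hypothetical and are not part of the downloadable program. The argument u is assumed to be the scaled distance dist / dist[k], so that 0 <= u <= 1:

```sas
proc iml;
/* u = distance divided by the max distance in the neighborhood */
start TricubeWt(u);
   return( (1 - u##3)##3 );        /* tricubic weights, without the 32/5 factor */
finish;

start BisquareWt(u);
   return( (1 - u##2)##2 );        /* bisquare (biweight) alternative */
finish;

u = T( do(0, 1, 0.25) );           /* scaled distances 0, 0.25, ..., 1 */
Tricube  = TricubeWt(u);
Bisquare = BisquareWt(u);
print u Tricube Bisquare;
quit;
```

Both functions equal 1 at u=0 and decrease to 0 at u=1. A constant multiplier such as 32/5 rescales every weight by the same amount, so it does not change the parameter estimates from a weighted least squares fit. To try an alternative, you would replace the line that computes w inside the loop with a call to the new weight function.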
Let’s use polynomials of degree 0, 1, and 2 to compute three different loess fits. The LoessData data set is defined in my previous article:
use LoessData; read all var {x y}; close;  /* read example data */
s = 0.383;                                 /* specify smoothing parameter */
k = floor(nrow(x)*s);                      /* num points in local neighborhood */

/* grid of points to evaluate loess curve (column vector) */
t = T( do(min(x), max(x), (max(x)-min(x))/50) );
Fit0 = LoessFit(t, x, y, k, 0);            /* loess fit with degree=0 polynomials */
Fit1 = LoessFit(t, x, y, k, 1);            /* degree=1 */
Fit2 = LoessFit(t, x, y, k, 2);            /* degree=2 */

create Sim var {x y t Fit0 Fit1 Fit2}; append; close;
QUIT;
You can use PROC SGPLOT to overlay these loess curves on a scatter plot of the data:
title "Overlay loess curves computed in SAS/IML";
proc sgplot data=Sim;
   label Fit0="Loess (Deg=0)" Fit1="Loess (Deg=1)" Fit2="Loess (Deg=2)";
   scatter x=x y=y;
   series x=t y=Fit0 / curvelabel;
   series x=t y=Fit1 / curvelabel lineattrs=(color=red);
   series x=t y=Fit2 / curvelabel lineattrs=(color=ForestGreen);
   xaxis grid; yaxis grid;
run;
The three curves are fairly close to each other on the interior of the data. The degree 2 curve wiggles more than the other two curves because it uses a higher-degree polynomial. The over- and undershooting becomes even more pronounced if you use cubic or quartic polynomials for the local weighted regressions.
The curious reader might wonder how these curves compare to curves that are created by PROC LOESS or by the LOESS statement in PROC SGPLOT. In the attached program I show that the IML implementation produces the same predicted values as PROC LOESS when you evaluate the models at the same set of points.
Most SAS data analysts are happy to use PROC LOESS. They don’t need to write their own loess algorithm in PROC IML. However, this article shows that IML provides the computational tools and matrix computations that you need to implement sophisticated algorithms, should the need ever arise.
The post Tip: How to close all data sets in SAS Enterprise Guide appeared first on The SAS Dummy.
This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.
Have you seen this error when running a program in SAS Enterprise Guide?
ERROR: You cannot open WORK.YOURDATA.DATA for output access with member-level control because WORK.YOURDATA.DATA is in use by you in resource environment IOM ROOT COMP ENV.
It has a simple cause: the data set that your program is trying to write (or rewrite) is open in the data viewer. With regard to this data file, your program is in contention with the SAS Enterprise Guide application.
Usually SAS Enterprise Guide closes all open data sets before running a program or task, and that’s meant to help you avoid this error. But sometimes a data set file remains open for one reason or another, and the conflict results in the error message. Fortunately, there is a simple fix.
Select Tools->View Open Data Sets. The View Open Data Sets window shows the names of the data files that SAS Enterprise Guide has open. And it offers a convenient Close All button to clear the list. Closing the data doesn’t affect the contents of the file or its place in your project. It simply removes the lock that SAS Enterprise Guide is holding on the file.
If you are running multiple SAS Enterprise Guide sessions, it’s possible for one session to have a lock on a file that you’re trying to update in another session. The View Open Data Sets window shows only those data sets from your current session, so be sure to check your other projects if you’re multitasking.
The default behavior — close all data before running SAS programs — is controlled in Tools->Options->SAS Programs. If you don’t want SAS Enterprise Guide to close your data windows, clear that checkbox. (It’s difficult for me to imagine why you would do that…but hey, we have options for everything.)