How to unquote SAS character variable values was published on SAS Users.
Before we delve into unquoting SAS character variables let’s briefly review existing SAS functionality related to the character strings quoting/unquoting.
Don’t be fooled by these macro functions’ names. They have nothing to do with quoting or unquoting character variables’ values. Moreover, they have nothing to do with quoting or unquoting even macro variables’ values. According to the %QUOTE macro function documentation, it masks special characters and mnemonic operators in a resolved value at macro execution. The %UNQUOTE macro function unmasks all special characters and mnemonic operators so they are interpreted as macro language elements instead of as text. There are many other SAS “macro quoting functions” (%SUPERQ, %BQUOTE, %NRBQUOTE, all macro functions whose names start with %Q: %QSCAN, %QSUBSTR, %QSYSFUNC, etc.) that perform some action including masking.
Historically, however, SAS Macro Language uses terms “quote” and “unquote” to denote “mask” and “unmask”. Keep that in mind when reading SAS Macro documentation.
Most SAS programmers are familiar with the QUOTE function that adds quotation marks around a character value. It can add double quotation marks (by default) or single quotation marks if you specify that in its second argument.
This function goes even further as it doubles any quotation mark that already existed within the value to make sure that an embedded quotation mark is escaped (not treated as an opening or closing quotation mark) during parsing.
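To make the doubling rule concrete, here is the same idea sketched in Python (an illustration of the behavior only, not SAS’s implementation; the function name is made up):

```python
def quote_like(value: str, quote_char: str = '"') -> str:
    """Mimic the behavior of SAS QUOTE(): wrap the value in quotation
    marks, doubling any embedded quotation mark of the same kind so it
    is escaped rather than treated as a closing mark."""
    return quote_char + value.replace(quote_char, quote_char * 2) + quote_char

print(quote_like('He said "hello"'))   # "He said ""hello"""
print(quote_like("Tom's car", "'"))    # 'Tom''s car'
```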
There is also a complementary DEQUOTE function that removes matching quotation marks from a character string that begins with a quotation mark. But be warned that it also deletes all characters to the right of the first matching quotation mark. In my view, deleting those characters is overkill because when writing a SAS program, we may not know what is going to be in the data and whether it’s okay to delete its part outside the first matching quotes. That is why you need to be extra careful if you decide to use this function. Here is an example of what I mean. If you run the following code:
data a;
   input x $ 1-50;
   datalines;
'This is what you get. Let's be careful.'
;

data _null_;
   set a;
   y = dequote(x);
   put x= / y=;
run;
you will get the following in the SAS log:
x='This is what you get. Let's be careful.'
y=This is what you get. Let
This is hardly what you really wanted as you have just lost valuable information – part of the y character value got deleted: 's be careful. I would rather not remove the quotation marks at all than remove them at the expense of losing meaningful information.
The $QUOTE informat does exactly what the DEQUOTE() function does; that is, it removes matching quotation marks from a character string that begins with a quotation mark. You can use it in the example above by replacing
y = dequote(x);
with the INPUT() function
y = input(x, $quote50.);
Or you can use it directly in the INPUT statement when reading raw data from datalines or an external file:
input x $quote50.;
Both the $QUOTE informat and the DEQUOTE() function, in addition to removing all characters to the right of the closing quotation mark, do the following unconventional, peculiar things:
If the described behavior matches your use case, you are welcome to use either the $QUOTE informat or the DEQUOTE() function. Otherwise, please read on.
Up to this point, such a function did not exist, but we are about to create one to justify the title. Let’s keep it simple and straightforward. Here is what I propose our new UNQUOTE() function do:
To summarize these specifications, our new UNQUOTE() function will extract a character substring within matching quotation marks if they are the first and the last nonblank characters in a character string. Otherwise, it returns the character argument unchanged.
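For readers who want the specification in a language-neutral form, here is the same logic sketched in Python (an illustration only; the SAS implementation follows):

```python
def unquote(s: str) -> str:
    """Sketch of the proposed UNQUOTE() logic: remove the outer quotation
    marks only when the first and last nonblank characters are the same
    quotation mark; otherwise return the argument unchanged."""
    stripped = s.strip()
    if len(stripped) < 2:                 # empty or single-character string
        return s
    if stripped[0] not in ('"', "'"):     # does not start with a quotation mark
        return s
    if stripped[-1] != stripped[0]:       # last nonblank char does not match first
        return s
    return stripped[1:-1]                 # drop only the outer quotation marks

print(unquote('  "How about this?"  '))   # How about this?
print(unquote('"How about this?\''))      # "How about this?' (unchanged: mismatched marks)
```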
Here is how such a function can be implemented using PROC FCMP:
libname funclib 'c:\projects\functions';

proc fcmp outlib=funclib.userfuncs.v1; /* outlib=libname.dataset.package */
   function unquote(x $) $32767;
      pos1 = notspace(x);                           *< first nonblank character position;
      if pos1=0 then return (x);                    *< empty string;
      char1 = char(x, pos1);                        *< first nonblank character;
      if char1 not in ('"', "'") then return (x);   *< first nonblank character is not " or ' ;
      posL = notspace(x, -length(x));               *< last nonblank character position;
      if pos1=posL then return (x);                 *< single-character string;
      charL = char(x, posL);                        *< last nonblank character;
      if charL^=char1 then return (x);              *< last nonblank character does not match first;
      /* at this point we should have matching quotation marks */
      return (substrn(x, pos1 + 1, posL - pos1 - 1)); *< remove first and last quotation character;
   endfunc;
run;
Here are the highlights of this implementation:
We use multiple RETURN statements: we sequentially check for different special conditions and if one of them is met we return the argument value intact. The RETURN statement does not just return the value, but also stops any further function execution.
At the very end, after making sure that none of the special conditions is met, we strip the argument value from the matching quotation marks along with the leading and trailing blanks outside of them.
NOTE: SAS user-defined functions are stored in a SAS data set specified in the OUTLIB= option of PROC FCMP. It requires a three-level name (libref.datasetname.packagename) for the function definition location to allow for several versions of the same-name function to be stored there.
However, when a user-defined function is used in a SAS DATA step, only a two-level name can be specified (libref.datasetname). If that data set has several same-name functions stored in different packages, the DATA step uses the latest function definition (found in the package closest to the bottom of the data set).
Let’s use the following code to test our newly minted user-defined function UNQUOTE():
libname funclib 'c:\projects\functions';
options cmplib=funclib.userfuncs;

data A;
   infile datalines truncover;
   input @1 S $char100.;
   datalines;
'
"
How about this?
 How about this?
"How about this?"
'How about this?'
"How about this?'
'How about this?"
" How about this?"
' How about this?'
' How "about" this?'
' How 'about' this?'
" How about this?"
" How "about" this?"
" How 'about' this?"
' How about this?'
;

data B;
   set A;
   length NEW_S $100;
   label NEW_S = 'unquote(S)';
   NEW_S = unquote(S);
run;
This code produces the following output table:
As you can see it does exactly what we wanted it to do – removing matching first and last quotation marks as well as stripping out blanks outside the matching quotation marks.
The INFILE statement’s DSD option is particularly useful when using LIST input to read and unquote comma-delimited raw data. In addition to removing enclosing quotation marks from character values, the DSD option specifies that when data values are enclosed in quotation marks, delimiters within the value are masked, that is, treated as character data (not as delimiters). It also sets the default delimiter to a comma and treats two consecutive delimiters as a missing value.
In contrast with the above UNQUOTE() function, the DSD option will not remove enclosing quotation marks if additional quotation marks of the same kind are present inside the character value. When the DSD option does strip enclosing quotation marks, it also strips leading and trailing blanks outside and within the removed quotation marks.
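The masking behavior of DSD is analogous to how general-purpose CSV readers treat quoted fields. For example, in Python’s csv module (shown purely as an analogy, not as the SAS implementation; the sample line is made up):

```python
import csv
import io

# A quoted field masks the embedded comma; two consecutive delimiters
# yield an empty (missing) value, much like the DSD option.
line = 'Ford,"Focus, SE",,2019'
row = next(csv.reader(io.StringIO(line)))
print(row)   # ['Ford', 'Focus, SE', '', '2019']
```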
Have you found this blog post useful? Please share your use cases, thoughts and feedback in the comments below.
Using SAS® to analyze and visualize a quadratic ANCOVA model was published on SAS Users.
Many examples for performing analysis of covariance (ANCOVA) are available in statistical textbooks, conference papers, and blogs. One of the primary interests for researchers and statisticians is to obtain a graphical representation for the fitted ANCOVA model. This post, coauthored with my SAS Technical Support colleague Kathleen Kiernan, demonstrates how to obtain the statistical analysis for a quadratic ANCOVA model by using the GLM procedure and graphical representation by using the SGPLOT procedure.
For the purposes of this blog, the tests for equality of intercepts and slopes have been omitted. For more detailed information about performing analysis of covariance using SAS, see SAS Note 24177, Comparing parameters (slopes) from a model fit to two or more groups.
Keep in mind that the statistical methods for ANCOVA are similar whether you have a one-way ANCOVA model or a repeated-measures ANCOVA model. The example data below consist of three treatments, the covariate X, and the response value Y. Predicted observations were generated by augmenting observations to the training data set that is used to fit the final model.
data ts132;
   input trt x y @@;
   datalines;
1 2.2 3.7  1 3.5 4.7  1 4.1 5.3  1 4.1 5.5  1 7.4 5.7  1 3.0 3.8
1 0.5 1.0  1 2.4 3.3  1 8.4 5.3  1 8.5 5.3  1 8.0 6.1  1 6.3 5.8
2 5.8 7.2  2 9.9 2.5  2 1.2 4.3  2 6.0 6.9  2 5.5 8.0  2 9.7 3.7
2 8.6 5.2  2 5.0 8.0  2 2.6 6.9  2 8.4 4.3  2 3.0 6.8  2 5.2 7.6
3 8.8 5.7  3 9.0 6.1  3 8.8 6.4  3 4.1 9.6  3 2.4 9.4  3 4.1 8.2
3 4.4 10.2 3 8.9 6.0  3 5.3 9.4  3 9.5 5.0  3 1.1 7.5  3 0.1 5.3
;
run;

data new;
   do trt=1 to 3;
      x=0;  y=.; output;
      x=10;      output;
   end;
run;

data combo;
   set ts132 new;
run;
The following statements fit the final model with common slope in the direction of x and unequal quadratic slope in direction of X*X.
proc glm data=combo;
   class trt;
   model y=trt x x*x*trt / solution noint;
   output out=anew p=predy;
   ods output parameterestimates=pe;
quit;
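In model notation, the statements above fit a model of the form (symbols are chosen here for illustration):

\(
Y_{ij} = \alpha_i + \beta x_{ij} + \gamma_i x_{ij}^{2} + \varepsilon_{ij}, \qquad i = 1, 2, 3
\)

where \(\alpha_i\) are the treatment intercepts, \(\beta\) is the common linear slope, and \(\gamma_i\) are the treatment-specific quadratic slopes.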
The Type III SS displays significance tests for the effects in the model.
The F statistic that corresponds to the TRT effect tests whether the intercepts are all zero (α_{1}=α_{2}=α_{3}=0). The F statistic that corresponds to X tests whether the common linear slope is zero (β=0). The F statistic that corresponds to the X*X*TRT effect tests whether the quadratic slopes are all zero (γ_{1}=γ_{2}=γ_{3}=0). The significance levels are very small, indicating that there is sufficient evidence to reject each null hypothesis.
You can write the parameter estimates to a data set by using the PARAMETERESTIMATES output object in an ODS OUTPUT statement in order to display the model equations on a graph.
The default ODS graphics from PROC GLM provides the following graphic representation of the model:
However, researchers, managers, and statisticians often want to modify the graph to include the model equation, to change the thickness of the lines, the symbols, the titles, and so on.
You can use PROC SGPLOT to plot the ANCOVA model and regression equations by using the OUTPUT OUT=ANEW data set from PROC GLM, as shown in this example:
proc sgplot data=anew;
   scatter y=y x=x / group=trt;
   pbspline y=predy x=x / group=trt smooth=0.5 nomarkers;
run;
This code produces the following graph:
You can see that the respective graphs from PROC GLM and PROC SGPLOT look similar.
It is easier to use PROC SGPLOT to enhance graphs rather than to do the same by editing the ODS Graphics Template associated with the GLM procedure.
To color-coordinate the respective treatments with the regression model equations and display them in the graph, you can use SG annotation. The TEXT function is used to place the text at specific locations within the graph.
The POLYGON and POLYCONT functions are used to draw a border around the equations. Note that you might need to adjust the values for X1 and Y1 in the annotate data set for your specific output.
data eqn;
   set pe(keep=parameter estimate) end=last;
   retain trt1 trt2 trt3 x x_x1 x_x2 x_x3;
   length function $9 textcolor $30;
   x1space='datapercent';
   y1space='wallpercent';
   function='text';
   x1=0;
   anchor='left';
   width=120;
   select (compbl(parameter));
      when ('trt 1') trt1=estimate;
      when ('trt 2') trt2=estimate;
      when ('trt 3') trt3=estimate;
      when ('x') do;
         if estimate < 0 then x=cats(estimate,'x');
         else x=cats(' + ',estimate,'x');
      end;
      when ('x*x*trt 1') do;
         if estimate < 0 then x_x1=cats(estimate,'x*x*trt1');
         else x_x1=cats(' + ',estimate,'x*x*trt1');
      end;
      when ('x*x*trt 2') do;
         if estimate < 0 then x_x2=cats(estimate,'x*x*trt2');
         else x_x2=cats(' + ',estimate,'x*x*trt2');
      end;
      when ('x*x*trt 3') do;
         if estimate < 0 then x_x3=cats(estimate,'x*x*trt3');
         else x_x3=cats(' + ',estimate,'x*x*trt3');
      end;
      otherwise;
   end;
   if last then do;
      label=cats("Y=",trt1,x,x_x1);
      y1=3;  textcolor="graphdata1:contrastcolor"; output;
      label=cats("Y=",trt2,x,x_x2);
      y1=10; textcolor="graphdata2:contrastcolor"; output;
      label=cats("Y=",trt3,x,x_x3);
      y1=17; textcolor="graphdata3:contrastcolor"; output;
      function='polygon';
      x1=0; y1=0;
      linecolor='grayaa'; linethickness=1; output;
      function='polycont';
      y1=21; output;
      x1=60; output;
      y1=0;  output;
   end;
run;

proc sgplot data=anew sganno=eqn;
   scatter y=y x=x / group=trt;
   pbspline y=predy x=x / group=trt smooth=0.5 nomarkers;
   yaxis offsetmin=0.3;
run;
This code results in the following graph with model equations:
You might also want to add a title; change the X- or Y-axis label; or change the color, thickness, and pattern of the lines as well as the symbol for the data points, as follows:
The following PROC SGPLOT code produces a graph with the customizations described above:
ods graphics / attrpriority=none;
title 'Fitted Plot for Quadratic Covariate with Unequal Slopes';
proc sgplot data=anew sganno=eqn;
   styleattrs datacontrastcolors=(navy purple darkgreen)
              datalinepatterns=(1 2 44)
              datasymbols=(squarefilled circlefilled diamondfilled);
   scatter y=y x=x / group=trt;
   pbspline y=predy x=x / group=trt smooth=0.5 nomarkers;
   yaxis offsetmin=0.3 label='Y value' labelattrs=(size=10pt color=black weight=bold);
   xaxis label='X value' labelattrs=(size=10pt color=black weight=bold);
run;
Keep in mind that you also need to adjust the color for the annotated text by changing the value for the TEXTCOLOR= variable (in the DATA step) from textcolor="graphdata1:contrastcolor" to reference the new colors, as shown in this example: textcolor="navy";
The resulting graph, including the modifications, is displayed below.
Using the output data set from your statistical procedure enables you to take advantage of the functionality of PROC SGPLOT to enhance your output. This post describes just some of the customizations you can make to your output.
Milliken, George A., and Dallas E. Johnson. 2002. Analysis of Messy Data, Volume III: Analysis of Covariance. London: Chapman and Hall/CRC.
SAS Institute Inc. 2006. SAS Note 24529, “Modifying axes, colors, labels, or other elements of statistical graphs produced using ODS Graphics.” Available at support.sas.com/kb/24/529.html.
The post Restricted least squares regression in SAS appeared first on The DO Loop.
A data analyst recently asked a question about restricted least square regression in SAS.
Recall that a restricted regression puts linear constraints on the coefficients in the model.
Examples include forcing a coefficient to be 1 or forcing two coefficients to equal each other. Each of these problems can be solved by using PROC REG in SAS.
The analyst’s question was, “Why can I use PROC REG to solve a restricted regression problem when the restriction involves EQUALITY constraints, but PROC REG can’t solve a regression problem that involves an INEQUALITY constraint?”
This article shows how to use the RESTRICT statement in PROC REG to solve a restricted regression problem that has linear constraints on the parameters. I also use geometry and linear algebra to show that solving an equalityconstrained regression problem is mathematically similar to solving an unrestricted least squares system of equations. This explains why PROC REG can solve equalityconstrained regression problems.
In a future article, I will show how to solve regression problems that involve inequality constraints on the parameters.
The linear algebra for restricted least squares regression gets messy,
but the geometry is easy to picture. A schematic depiction of restricted regression is shown to the right.
Geometrically, ordinary leastsquares (OLS) regression is the orthogonal projection of the observed response (Y) onto the column space of the design matrix. (For continuous regressors, this is the span of the X variables, plus an “intercept column.”)
If you introduce equality constraints among the parameters, you are restricting the solution to a linear subspace of the span (shown in green). But the geometry doesn’t change. The restricted least squares (RLS) solution is still a projection of the observed response, but this time onto the restricted subspace.
In terms of linear algebra, the challenge is to write down the projection matrix for the restricted problem.
As shown in the diagram, you can obtain the RLS solution in two ways:
Let’s start with an example, which is based on an example in
the online textbook by A. Buteikis.
The following SAS DATA step simulates data from a regression model
Y = B0 + B1*X1 + B2*X2 + B3*X3 + B4*X4 + ε
where B3 = B1 and B4 = –2*B2 and ε ~ N(0, 3) is a random error term.
/* Model: Y = B0 + B1*X1 + B2*X2 + B3*X3 + B4*X4
   where B3 = B1 and B4 = -2*B2 */
data RegSim;
   array beta[0:4] (5, 2, 3, 2, -6);   /* parameter values for regression coefficients */
   N = 100;                            /* sample size */
   call streaminit(123);
   do i = 1 to N;
      x1 = rand("Normal", 5, 2);       /* X1 ~ N(5,2) */
      x2 = rand("Integer", 1, 50);     /* X2 ~ random integer 1:50 */
      x3 = (i-1)*10/(N-1);             /* X3 = sequence: 0 to 10 */
      x4 = rand("Normal", 10, 3);      /* X4 ~ N(10, 3) */
      eps = rand("Normal", 0, 3);      /* RMSE = 3 */
      Y = beta[0] + beta[1]*x1 + beta[2]*x2 + beta[3]*x3 + beta[4]*x4 + eps;
      output;
   end;
run;
Although there are five parameters in the model (B0-B4), there are only three free parameters because the values of B3 and B4 are determined by the values of B1 and B2, respectively.
If you suspect that a parameter in your model is a known constant or is equal to another parameter, you can use the RESTRICT statement in PROC REG to incorporate that restriction into the model. Before you do, however, it is usually a good idea to perform a statistical test to see if the data supports the model. For this example, you can use the TEST statement in PROC REG to hypothesize that B3 = B1 and B4 = –2*B2. Recall that the syntax for the TEST statement uses the variable names (X1-X4) to represent the coefficients of the variables.
The following call to PROC REG carries out this analysis:
proc reg data=RegSim plots=none;
   model Y = x1 - x4;
   test x3 = x1, x4 = -2*x2;
run;
The output shows that the parameter estimates for the simulated data are close to the parameter values used to generate the data. Specifically, the root MSE is close to 3, which is the standard deviation for the error term. The Intercept estimate is close to 5. The coefficients for the X1 and X3 term are each approximately 2.
The coefficient for X4 is approximately –2 times the coefficient of X2.
The F test for the test of hypotheses has a large pvalue, so you should not reject the hypotheses that B3 = B1 and B4 = –2*B2.
If the hypothesis on the TEST statement fails to reject, then you can use the RESTRICT statement to recompute the parameter estimates subject to the constraints. You could rerun the entire analysis or, if you are running SAS Studio in interactive mode, you can simply submit a RESTRICT statement and PROC REG will compute the new parameter estimates without rereading the data:
restrict x3 = x1, x4 = -2*x2;   /* restricted least squares (RLS) */
print;
quit;
The new parameter estimates enforce the restrictions among the regression coefficients. In the new model, the coefficients for X1 and X3 are exactly equal, and the coefficient for X4 is exactly –2 times the coefficient for X2.
Notice that the ParameterEstimates table is augmented by two rows labeled “RESTRICT”. These rows have negative degrees of freedom because they represent constraints. The “Parameter Estimate” column shows the values of the Lagrange multipliers that are used to enforce equality constraints while solving the least squares system. Usually, a data analyst does not care about the actual values in these rows, although the Wikipedia article on Lagrange multipliers discusses ways to interpret the values.
This section shows the linear algebra behind the restricted least squares solution by using SAS/IML. Recall that the usual way to compute the unrestricted OLS solution is to solve the “normal equations” (X`*X)*b = X`*Y for the parameter estimates, b. Although textbooks often solve this equation by using the inverse of the X`X matrix, it is more efficient to use the SOLVE function in SAS/IML to solve the equation, as follows:
proc iml;
varNames = {x1 x2 x3 x4};
use RegSim;                       /* read the data */
   read all var varNames into X;
   read all var "Y";
close;

X = j(nrow(X), 1, 1) || X;        /* add intercept column to form design matrix */
XpX = X`*X;

/* Solve the unrestricted OLS system */
b_ols = solve(XpX, X`*Y);         /* normal equations X`*X * b = X`*Y */

ParamNames = "Intercept" || varNames;
print b_ols[r=ParamNames F=10.5];
The OLS solution is equal to the first (unrestricted) solution from PROC REG.
If you want to require that the solution satisfies B3 = B1 and B4 = –2*B2, then you can augment the X`X matrix to enforce these constraints. If L*b = c is the matrix equation for the linear constraints, then the augmented system is
\(
\begin{pmatrix}
X^{\prime}X & L^{\prime} \\
L & 0
\end{pmatrix}
\begin{pmatrix}
b_{\small\mathrm{RLS}} \\
\lambda
\end{pmatrix}
=
\begin{pmatrix}
X^{\prime}Y \\
c
\end{pmatrix}
\)
The following statements solve the augmented system:
/* Direct method: Find b_RLS by solving the augmented system */
/*   Int X1 X2 X3 X4 */
L = {  0  1  0 -1  0,     /* X1 - X3 = 0   */
       0  0  2  0  1};    /* 2*X2 + X4 = 0 */
c = {0, 0};

zero = j(nrow(L), nrow(L), 0);
A = (XpX || L`  ) //
    (L   || zero);
z = X`*Y // c;
b_rls = solve(A, z);

rNames = ParamNames || {'Lambda1' 'Lambda2'};
print b_rls[r=rNames F=10.5];
The parameter estimates are the same as are produced by using PROC REG. You can usually ignore the last two rows, which are the values of the Lagrange multipliers that enforce the constraints.
By slogging through some complicated linear algebra, you can also obtain the restricted least squares solution from the OLS solution and the constraint matrix (L). The following
equations use the formulas in
the online textbook by A. Buteikis to compute the restricted solution:
/* Adjust the OLS solution to obtain the RLS solution:
   b_RLS = b_OLS - b_adjustment
   Follows http://web.vu.lt/mif/a.buteikis/wp-content/uploads/PE_Book/4-4-Multiple-RLS.html */
RA_1 = solve(XpX, L`);
RA_3 = solve(L*RA_1, L*b_ols - c);
b_adj = RA_1 * RA_3;
b_rls = b_ols - b_adj;
print b_rls[r=ParamNames F=10.5];
This answer agrees with the previous answer but does not compute the Lagrange multipliers. Instead, it computes the RLS solution as the projection of the OLS solution onto the restricted linear subspace.
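Both computations are easy to reproduce outside of SAS. Here is a NumPy sketch (simulated data; the same constraint matrix L; variable names are my own) confirming that the augmented-system solution and the projected OLS solution agree:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5                       # 100 observations, 5 parameters (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
Y = rng.normal(size=n)

L = np.array([[0., 1., 0., -1., 0.],    # b1 - b3 = 0
              [0., 0., 2., 0., 1.]])    # 2*b2 + b4 = 0
c = np.zeros(2)

XpX, XpY = X.T @ X, X.T @ Y
b_ols = np.linalg.solve(XpX, XpY)       # unrestricted OLS solution

# Method 1: solve the augmented (bordered) system
A = np.block([[XpX, L.T], [L, np.zeros((2, 2))]])
z = np.concatenate([XpY, c])
b_rls_direct = np.linalg.solve(A, z)[:p]     # drop the Lagrange multipliers

# Method 2: project the OLS solution onto the restricted subspace
R1 = np.linalg.solve(XpX, L.T)
adj = R1 @ np.linalg.solve(L @ R1, L @ b_ols - c)
b_rls_proj = b_ols - adj

print(np.allclose(b_rls_direct, b_rls_proj))   # the two methods agree
print(np.allclose(L @ b_rls_direct, c))        # the constraints are satisfied
```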
This article shows how to use the RESTRICT statement in PROC REG to solve a restricted regression problem that involves linear constraints on the parameters. Solving an equalityconstrained regression problem is very similar to solving an unrestricted least squares system of equations. Geometrically, they are both projections onto a linear subspace. Algebraically, you can solve the restricted problem directly or as the projection of the OLS solution.
Although changing one of the equality constraints into an INEQUALITY seems like a minor change, it means that the restricted region is no longer a linear subspace. Ordinary least squares can no longer solve the problem because the solution might not be an orthogonal projection. The next article discusses ways to solve problems where the regression coefficients are related by linear inequalities.
Using REST API to transform a Visual Analytics Report was published on SAS Users.
There are three types of visualization APIs defined in the SAS Viya REST API reference documentation: Reports, Report Images, and Report Transforms. You may have seen the posts on how to use Reports and Report Images. In this post, I’m going to show you how to use the Report Transforms API. The scenario I am using changes the data source of a SAS Visual Analytics report and saves the transformed report.
The Report Transforms API provides simple alterations to SAS Visual Analytics reports, and it uses the ‘application/vnd.sas.report.transform’ media type for the transformation (passed in the REST API call header). When part of a request, the transform performs editing or modifications to a report. If part of a response, the transform describes the operation performed on the report. Some typical transformations include:
To use the Transforms API, we need to properly set the request body and some attributes for the transform. After the transform, the response contains the transformed report or a reference to the report.
This step is very straightforward. In SAS Visual Analytics, create a report and save it to a folder (instructions on creating a report are found in this video). For this example, I’ll use the ‘HPS.CARS’ table as the data source and create a bar chart. I save the report with name ‘Report 1’ in ‘My Folder’. I’ll use this report as the original report in the transform.
I will use PROC HTTP to call the Transforms API using the ‘POST’ method and appending the URL with ‘/reportTransforms/dataMappedReports‘. The call needs to set the request body.
/* data source identification for the original report */
{
  "namePattern": "serverLibraryTable",
  "purpose": "original",
  "server": "cas-shared-default",
  "library": "HPS",
  "table": "CARS"
}

/* data source identification for the transformed report */
{
  "namePattern": "serverLibraryTable",
  "purpose": "replacement",
  "server": "cas-shared-default",
  "library": "CASUSER",
  "table": "CARS_NEW",
  "replacementLabel": "NEW CARS",
  "dataItemReplacements": [
    { "originalColumn": "dte",  "replacementColumn": "date"   },
    { "originalColumn": "wght", "replacementColumn": "weight" },
    { "originalColumn": "dest", "replacementColumn": "region" }
  ]
}
Besides the request body, we need to set some other attributes for the transform API when changing the data source. These include ‘useSavedReport’, ‘saveResult’, ‘failOnDataSourceError’ and ‘validate’.
I’ll save the transformed report with the name ‘Transformed Report 1’ in the same folder as the original ‘Report 1’. I set the ‘resultReportName’ to ‘Transformed Report 1’, and set the ‘resultReport’ with ‘name’ and ‘description’ attributes. I also need to get the folderURI of the ‘My Folder’ directory. You may refer to my previous post to see how to get the folderURI using REST APIs.
Below is the section of the settings for the target report and folder:
"resultReportName": "Transformed Report 1",
"resultParentFolderUri": "/folders/folders/cf981702-fb8f-4c6f-bef3-742dd898a69c",
"resultReport": {
   "name": "Transformed Report 1",
   "description": "TEST report transform"
}
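To make the pieces concrete, here is a hedged Python sketch that assembles such a request with the standard library. The host, token, query parameters, and request-body envelope are illustrative assumptions; consult the Report Transforms API reference for the exact schema. The request is constructed but not sent:

```python
import json
import urllib.request

# Hypothetical host and token -- substitute your own environment's values.
host = "https://viya.example.com"
token = "eyJhb..."            # OAuth access token (placeholder)

body = {
    "dataSources": [          # assumed envelope; see the REST API reference
        {"namePattern": "serverLibraryTable", "purpose": "original",
         "server": "cas-shared-default", "library": "HPS", "table": "CARS"},
        {"namePattern": "serverLibraryTable", "purpose": "replacement",
         "server": "cas-shared-default", "library": "CASUSER", "table": "CARS_NEW"},
    ],
    "resultReportName": "Transformed Report 1",
}

req = urllib.request.Request(
    url=host + "/reportTransforms/dataMappedReports?useSavedReport=true&saveResult=true",
    data=json.dumps(body).encode("utf-8"),
    method="POST",
    headers={
        "Authorization": "Bearer " + token,
        # transform media type; the JSON representation uses a +json suffix
        "Content-Type": "application/vnd.sas.report.transform+json",
        "Accept": "application/vnd.sas.report.transform+json",
    },
)
# urllib.request.urlopen(req) would submit the transform; omitted here.
print(req.get_method(), req.get_header("Content-type"))
```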
Now, we have set all the necessary parameters for the transform and are ready to run it. I put my entire set of code on GitHub. Running the code creates the ‘Transformed Report 1’ report in ‘My Folder’, with the data source changed to ‘CASUSER.CARS_NEW’, containing the three mapped columns.
If the API fails to create the transformed report, the PROC SQL statements display an error code and error message. For example, if the replacement data source is not valid, the call returns errors similar to the following.
If the API successfully creates the transformed report, “201 Created” is displayed in the log. You may find more info about the transformed report in the response body of tranFile from PROC HTTP. You can also log into the SAS Visual Analytics user interface to check that the transformed report opens successfully and that the data sources are changed as expected. Below are screenshots of the original report and the transformed report. You may have already noticed they use different data sources from the data labels.
There are a wealth of other transformations available through the Report Transform APIs. Check out the SAS Developer site for more information.
Building custom apps on top of SAS Viya, Part Three: Choosing an OAuth flow was published on SAS Users.
This is the third installment of a series focused on the secure integration of custom applications into your SAS Viya platform. I began the series with an example of how a recent customer achieved value-add on top of their existing investments by building a custom app for their employees and embedding repeatable analytics from SAS Viya into it. Then in the second part, I outlined the security frameworks and main integration points used to accomplish that.
As a quick recap, in the last post I covered how to integrate custom-written applications with the SAS Viya platform using scoped access tokens from SAS Logon Manager. I outlined that the returned access tokens need to be included with the REST calls those custom apps make to SAS Viya endpoints and resources. The access tokens contain the necessary information SAS Viya will use to make authorization decisions for every call your app makes. Administrators of your SAS Viya environment have complete control over the rules that determine the outcome of those decisions. I also introduced the primary decision you need to make before you begin integrating your custom app into your SAS Viya environment: whether those access tokens will be scoped in the context of your application (general-scoped) or in the context of each logged-in user of your application (user-scoped).
Now let’s go a bit deeper into the authorization flows for both scenarios. TL;DR at the bottom.
Quick disclaimer: The rest of this post assumes you are familiar with the SAS Administration concepts of Custom Groups, granting access to SAS Content, and creating authorization rules. If those topics are unfamiliar to you and you will be implementing this custom application registration, please take a look at the resources linked inline.
First, when I say “general-scoped access” here, I mean cases where you don’t want end users to have to log in (directly to SAS Viya, at least). You don’t want the tokens returned by SAS Logon to your custom applications generating authorization decisions based on the individual using your application. In these cases, however, you do still want control over the endpoints/resources your app calls and which CRUD operations it performs on them. Any situation where you’d like to expose some content you’ve created in SAS Viya to a large scale of users, particularly those that wouldn’t have valid login credentials to your SAS Viya environment, falls in this category. There are two main “flows” available to make this happen. Here, “flow” is a term used to describe the process for accessing a protected resource within the OAuth framework.
Password flow
I won’t go into much detail on this one because the folks behind OAuth 2.0 are suggesting the phase-out of this flow (source one) (source two). Though it technically works for this use case, it does so by essentially using a “service account,” so all REST calls your application makes are within the scope of a single UID. You could create specific authorization grants for that user (or a group it’s a member of) to make this functional, but it requires that you include the username and password as clear-text URL parameters when you request your token. It’s just not a great idea and not secure to send this over the wire, even if you are using HTTPS. It also creates another sensitive environment variable to manage and, worse, I have even seen it hard-coded into custom application logic.
Client Credentials flow
Interestingly enough, the combination of the security concerns around the Password flow and the need to provide specific access levels for a custom application our team was building is what prompted me to write this series. If you’re reading this, you’ve likely checked out the page on how to register clients to SAS Logon Manager on developer.sas.com. These instructions do not cover the Client Credentials flow, though Mike Roda’s SAS Global Forum paper, OpenID Connect Opens the Door to SAS Viya APIs, does provide guidance on the client credentials grant type. Remember, these types of scenarios don’t warrant concern for which individual users are making the calls into SAS Viya; however, you still will want to restrict which calls your app makes, so follow least-privilege access best practices.
The secret sauce to making this work is registering your application to SAS Logon Manager using a specific, optional key called authorities
and setting its value to be a list of custom group(s) giving your application the access to the proper resources and endpoints. Here’s where some prior knowledge of creating authorization rules for specific URIs using endpoint patterns and specific principals comes in handy. Check out the documentation linked above if those are unfamiliar terms. We’ll see examples of how to implement this flow in the next post in this series.
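For orientation, here is a minimal sketch of what a Client Credentials token request contains. The client ID and secret are hypothetical values; the key point is that the application authenticates as itself via HTTP Basic authentication built from "client_id:client_secret", with no user credentials involved:

```python
import base64

# Hypothetical client registration values for illustration
client_id = "app.bidassist"
client_secret = "app.secret"

# The application authenticates as *itself*: HTTP Basic auth is built
# from "client_id:client_secret", base64-encoded.
basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
headers = {
    "Authorization": f"Basic {basic}",
    "Content-Type": "application/x-www-form-urlencoded",
}
body = "grant_type=client_credentials"   # no username, no password
```

A POST with these headers and this body to the token endpoint returns an access token scoped to whatever authorities (custom groups) were listed when the client was registered.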
The diagram below illustrates this flow. Viewing hint: click on the photo to focus on it, and click on the fullscreen button to enlarge it.
As a quick recap, here I’m talking about the cases where it does make sense for your application’s end users to log in to SAS. All subsequent API calls into SAS Viya resources your application makes should be within the context of what they, as individual users, are allowed to do/see/access in SAS Viya. For example, if a user is not in the SAS Administrators group, then you’ll want to ensure that any REST calls your custom app makes for that user that would require membership in that group will indeed fail.
Admittedly, this takes more work on the application development side to implement; however, as far as creating and registering the client for your application, this isn’t any more complicated than what I’ve already outlined in this post. By the nature of the flow I’ll outline in a minute, the application developer needs to add additional logic to handle sessions and/or cookies to obtain, store, check, update, and eventually revoke individual users’ authorization codes. When you use SAS Viya applications, this all occurs behind the scenes. For example, when you first navigate to SAS Visual Analytics (VA), you are redirected to SAS Logon Manager to log in and get your user-scoped access token. Then you are redirected back to VA with your token. In turn, VA and the supporting services use this token for any subsequent REST calls to other SAS Viya endpoints and resources. When you create custom-developed apps that integrate with SAS Viya for user-scoped access, you need to recreate that baked-in behavior. This is the process I outline below.
Authorization Code flow
A shortened-for-readability version of how this works is:
Joe Furbee covers the authorization code process in detail in Authentication to SAS Viya: a couple of approaches. The whole point of using this flow in this manner is to ensure that SAS Logon Manager is responsible for handling individual user credentials rather than your application, which by contrast would then require submitting them over the wire. The diagram below represents the flow.
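As a rough sketch of the first leg of this flow, the application sends the user’s browser to the authorize endpoint with response_type=code, which is the standard OAuth 2.0 convention used by SAS Logon Manager (the host name, client ID, and redirect URI below are hypothetical):

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values; /SASLogon/oauth/authorize and response_type=code
# follow standard OAuth 2.0 authorization-code conventions.
authorize = "https://viya.example.com/SASLogon/oauth/authorize"
redirect = authorize + "?" + urlencode({
    "client_id": "app.bidassist",
    "response_type": "code",   # ask SAS Logon for an authorization code
    "redirect_uri": "https://app.example.com/callback",
})
# After the user logs in, SAS Logon Manager redirects back to
# redirect_uri with ?code=..., which the app exchanges server-side
# for a user-scoped access token.
```

The application never sees the user’s password; it only ever handles the short-lived code and the resulting token.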
Password flow
In fairness, this technically works here as well, but we already know the drill.
My goal in this post (and in part 2) was to introduce the necessary planning steps prior to registering a custom-developed application to SAS Logon Manager so that it can make REST API calls to SAS Viya endpoints and resources. To recap, the steps are:
1. Decide whether users should have to log into SAS Viya or not.
2A. If users should not need to log in, consider using the Client Credentials flow. Set up the security principal(s) and authorization rule(s) to give your application the appropriate permissions to the needed resources, and then register your client with the relevant settings.
2B. If users should log in, consider using the Authorization Code flow. Enhance your custom application’s ability to request and handle user- and session-specific authorization codes and tokens, substituting them into subsequent API calls when needed, and then register your client with the relevant settings.
In either case, I’ll cover examples in the next post in this series.
Building custom apps on top of SAS Viya, Part Three: Choosing an OAuth flow was published on SAS Users.
The post Video: How to Write a Custom Parallel Program in SAS Viya appeared first on The DO Loop.
My 2020 SAS Global Forum paper was about how to write custom parallel programs
by using the iml action in SAS Viya 3.5.
My conference presentation was canceled because of the coronavirus pandemic,
but I recently recorded a 15-minute video that summarizes the main ideas in the paper.
One of the reasons I enjoy attending conferences is to obtain a high-level “mental roadmap” of a topic that I can leverage if I decide to study the details.
Hopefully, this video will provide you with a roadmap of the iml action and its capabilities.
If your browser does not support embedded video, you can go directly to the video on YouTube.
The video introduces a few topics that I have written about in more detail:
For more details you can read the paper: Wicklin and Banadaki, 2020, “Write Custom Parallel Programs by Using the iml Action.”
The post Iterative proportional fitting in SAS appeared first on The DO Loop.
I previously wrote about the RAS algorithm, which is a simple algorithm that performs matrix balancing.
Matrix balancing refers to adjusting the cells of a frequency table to match known values of the row and column sums. Ideally, the balanced matrix will reflect the structural relationships in the original matrix.
An alternative method is the
iterative proportional fitting (IPF) algorithm, which is implemented in the IPF subroutine in SAS/IML.
The IPF method can balance n-way tables, n ≥ 2.
The IPF subroutine is a statistical modeling method. It computes maximum likelihood estimates for a hierarchical log-linear model of the counts as a function of the categorical variables. It is more complicated to use than the RAS algorithm, but it is also more powerful.
This article shows how to use the IPF subroutine for the twoway frequency table from my previous article. The data are for
six girl scouts who sold seven types of cookies.
We know the row totals for the Type variable (the boxes sold for each cookie type) and the column totals for the Person variable (the boxes sold by each girl). But we do not know the values of the individual cells, which are the Type-by-Person totals.
A matrix balancing algorithm takes any guess for the table and maps it into a table that has the observed row and column sums.
As mentioned in the previous article, there are an infinite number of matrices that have the
observed row and column totals.
Before we discuss the IPF algorithm, I want to point out that it is easy to compute one particular balanced matrix. The special matrix is the one that assumes no association (independence) between the Type and Person variables. That matrix is formed by the
the outer product of the marginal totals:
B[i,j] = (sum of row i)(sum of col j) / (sum of all counts)
It is simple to compute the “noassociation” matrix in the SAS/IML language:
proc iml;
/* Known quantities: row totals and column totals */
u = {260, 214, 178, 148, 75, 67, 59};  /* total boxes sold, by type */
v = {272 180 152 163 134 100};         /* total boxes sold, by girl */

/* Compute the INDEPENDENT table, which is formed by the outer product
   of the marginal totals:
   B[i,j] = (sum of row i)(sum of col j) / (sum of all counts) */
B = u*v / sum(u);

Type = 'Cookie1':'Cookie7';            /* row labels */
Person = 'Girl1':'Girl6';              /* column labels */
print B[r=Type c=Person format=7.1 L='Independent Table'];
For this balanced matrix, there is no association between the row and column variables (by design). This matrix is the “expected count” matrix that is used in a chi-square test for independence between the Type and Person variables.
Because the cookie types are listed in order from most popular to least popular,
the values in the balanced matrix (B) decrease down each column. If you order the girls according to how many boxes they sold (you need to exchange the third and fourth columns), then the values in the balanced matrix would decrease across each row.
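The outer-product formula is easy to sanity-check outside of SAS/IML. This small pure-Python sketch rebuilds the no-association table B from the same totals and confirms that its margins match the known row and column sums:

```python
# Row totals (boxes per cookie type) and column totals (boxes per girl)
u = [260, 214, 178, 148, 75, 67, 59]
v = [272, 180, 152, 163, 134, 100]
total = sum(u)   # grand total; sum(v) must equal the same value

# No-association table: B[i][j] = u[i]*v[j]/total
B = [[ui * vj / total for vj in v] for ui in u]

row_sums = [sum(row) for row in B]
col_sums = [sum(B[i][j] for i in range(len(u))) for j in range(len(v))]
```

By construction, each row of B sums to u[i] and each column sums to v[j], so B is one valid balanced matrix.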
The documentation for the IPF subroutine states that you need to specify the following input arguments. For clarity, I will only discuss
two-way r x c tables.
The following SAS/IML program defines the parameters for the IPF call.
/* Specify structure between rows and columns by asking each girl to
   estimate how many boxes of each type she sold. This is the matrix
   whose cells will be adjusted to match the marginal totals. */
/*    G1 G2 G3 G4 G5 G6          */
A = {75 45 40 40 40 30 ,  /* C1 */
     40 35 45 35 30 30 ,  /* C2 */
     40 25 30 40 30 20 ,  /* C3 */
     40 25 25 20 20 20 ,  /* C4 */
     30 25  0 10 10  0 ,  /* C5 */
     20 10 10 10 10  0 ,  /* C6 */
     20 10  0 10  0  0 }; /* C7 */

Dim = ncol(A) || nrow(A);   /* dimensions of r x c table */
Config = {1 2};             /* model main effects: counts depend on Person and Type */

call IPF(FitA, Status,      /* output values */
         Dim, B,            /* use the "no association" matrix to specify marginals */
         Config, A);        /* use best-guess matrix to specify structure */

/* the IPF call returns the FitA matrix and Status vector */
print FitA[r=Type c=Person format=7.2],
      Status[c={'rc' 'Max Diff' 'Iters'}];
The IPF call returns two quantities. The first output is the balanced matrix. For this example, the balanced matrix (FitA) is identical to the result from the RAS algorithm. It has a structure similar to the “best guess” matrix (A). For example, notice that the cells in the original matrix that are zero are also zero in the balanced matrix.
The second output is the Status vector.
The status vector provides information about the convergence of the IPF algorithm:
The IPF subroutine is more powerful than the RAS algorithm because you can fit n-way tables and because you can model the table in several natural ways. For example, the model in the previous section assumes that the counts depend on the salesgirl and on the cookie type. You can also model the situation where the counts depend only on one of the factors and not on the other:
/* counts depend only on Person, not on cookie Type */
Config = {1};
call IPF(FitA1, Status, Dim, B, Config, A);
print FitA1[r=Type c=Person format=7.2], Status;

/* counts depend only on cookie Type, not on Person */
Config = {2};
call IPF(FitA2, Status, Dim, B, Config, A);
print FitA2[r=Type c=Person format=7.2], Status;
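Conceptually, fitting a single main effect reduces to one proportional scaling pass along that dimension. Here is a rough Python sketch of that idea (which dimension a Config value refers to depends on how the table’s dimensions are ordered in the IPF call, so treat the choice of the column margin below as an illustrative assumption): each column of the estimate A is scaled so that its sum matches the known per-girl total.

```python
# Girls' best-guess estimates (rows = cookie types, cols = girls)
A = [[75, 45, 40, 40, 40, 30],
     [40, 35, 45, 35, 30, 30],
     [40, 25, 30, 40, 30, 20],
     [40, 25, 25, 20, 20, 20],
     [30, 25,  0, 10, 10,  0],
     [20, 10, 10, 10, 10,  0],
     [20, 10,  0, 10,  0,  0]]
v = [272, 180, 152, 163, 134, 100]   # known column (Person) totals

# Fit only the column margin: a single proportional scaling pass
col_tot = [sum(row[j] for row in A) for j in range(len(v))]
fit = [[row[j] * v[j] / col_tot[j] for j in range(len(v))] for row in A]

new_col = [sum(row[j] for row in fit) for j in range(len(v))]
```

After one pass, the fitted column sums equal v exactly, while the relative proportions within each column are unchanged; the row margins are left unconstrained, which is what a one-factor model implies.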
In summary, the SAS/IML language provides a builtin function (CALL IPF) for matrix balancing of frequency tables.
The routine incorporates statistical modeling and you can specify which variables to use to model the cell counts.
The documentation includes several examples that demonstrate using the IPF algorithm to balance n-way tables, n ≥ 2.
Building custom apps on top of SAS Viya, Part Two: Understanding the integration points was published on SAS Users.
Welcome back! Today, I continue with part 2 of my series on building custom applications on SAS Viya. The goal of this series is to discuss securely integrating custom-written applications into your SAS Viya platform.
In the first installment of this series, I outlined my experiences on a recent project. In that example, our team worked with a pre-owned auto mall to create a custom application for their buyers to use at auction, facilitating data-driven decisions and maximizing potential profits. Buyers use the app’s UI to select/enter the characteristics of the specific car on which they’d like to bid. Those details are then sent via an HTTP REST call to a specific API endpoint in the customer’s SAS Viya environment, which exposes a packaged version of a detailed decision flow. Business rules and analytic models evaluate the car details as part of that flow. The endpoint returns a maximum bid recommendation and other helpful information via the custom application’s UI.
If you haven’t already, I encourage you to read part 1 for more details.
If the concept of exposing endpoints to custom applications elicits some alarming thoughts for you, fear not. The goal of this post is to introduce the security-related considerations when integrating custom applications into your SAS Viya environment. In fact, that is the topic of the next two sections. Today we’ll take a look at the security underpinnings of SAS Viya so we can start to flesh out how that integration works. In part 3, we’ll dive into more specific decision points. To round things out, the series finishes with some examples of the concepts discussed in parts 1-3.
So how does SAS Viya know it can trust your custom app? How does it know your app should have the authority to call specific API endpoints in your SAS Viya environment and what HTTP methods it should allow? Moreover, who should be able to make those REST calls, the app itself or only individual loggedin users of that app? To answer these questions, we need to understand the foundation of the security frameworks in SAS Viya.
Security in SAS Viya is built around OAuth 2.0 (authorization) and OpenID Connect (authentication), both of which are industrystandard frameworks. The details of how they work, separately and together, are out of the scope of this post. We will, however, delve into a few key things to be aware of within their implementation context for SAS Viya.
First, SAS Logon Manager is the centerpiece. When a client (application) requests access to endpoints and resources, it must first communicate with SAS Logon Manager, which returns a scoped access token if the request is valid. This is true both within SAS Viya (as when internal services hosting APIs communicate with each other) and for external applications like we’re talking about here. The client/application subsequently hands that token in with its request to the desired endpoint.
The long and short of it is that we don’t want users of our custom applications to see or manipulate anything they shouldn’t, so these tokens need to be scoped. The authorization systems in SAS Viya need to be able to understand what the bearer of that specific token should be able to do. For a detailed overview of how SAS Viya services leverage OAuth and OpenID Connect tokens for authentication, I’d recommend reading this SAS Global Forum paper from SAS’ Mike Roda.
Later in this series, we’ll see examples of obtaining a scoped token and then using it to make REST calls to SAS Viya API endpoints, like the auto mall’s custom app does to return recommendations for its users. For now, suffice it to say the scope of the token is the critical component to answering our questions regarding how SAS Viya returns specific responses to our custom application.
We’ve established that native SAS Viya applications and services register themselves as clients to SAS Logon Manager. When a client requests access to an endpoint and/or resource, it must first communicate with SAS Logon Manager, which returns a scoped access token if the request is valid. If the request was unauthorized, the client receives a 403 Forbidden (or sometimes a 401 Unauthorized) response. Your SAS Viya environment’s administrator has complete control over the rules that determine the authorization decision. We can register additional custom applications, like the auto mall’s auction bidding app, to SAS Logon Manager so they too can request scoped tokens. This provides our custom applications with fine-grained access to endpoints and content within SAS Viya. But we’ve already hit a fork in the road. We need to understand the purpose of the application we’re looking to build, so we can decide if the access should be user-scoped or general-scoped.
In the pre-owned auto mall’s custom application, a decision and modeling flow is exposed by way of a SAS Micro Analytic Service (MAS) module at a specific HTTP REST API endpoint. All the users of the application have the same level of access to the same endpoint by design; there was no need for differentiated access levels. This is considered an instance of general-scoped access. Abiding by a least-privilege approach, the app should access only certain data and endpoints in SAS Viya, that is, only what it needs rather than having free rein. In this case, the appropriate level of access can be granted to the registered client (custom application) itself. The result is an access token from SAS Logon Manager scoped within the context of the application.
In other applications, you might find it necessary to make more sensitive or user-specific resources available. In this instance, you’ll want to make all authorization decisions within the context of the individual logged-in user. For these situations, it is more prudent to redirect the user directly to SAS Logon Manager to log in and then redirect them back to your application. This results in SAS Logon Manager returning an access token scoped within the context of the logged-in user.
Once you’ve decided which of these two scenarios most closely fits your custom application’s needs, the next step is to decide on a flow (or grant) that provides the associated type of access token, general- or user-scoped. Here, “flow” is a term used to describe the process for accessing a protected resource within the OAuth framework, and that’s what I’ll cover in the next post in this series.
The post Matrix balancing: Update matrix cells to match row and column sums appeared first on The DO Loop.
Matrix balancing is an interesting problem that has a long history.
Matrix balancing refers to adjusting the cells of a frequency table to match known values of the row and column sums. One of the early algorithms for matrix balancing is known as the RAS algorithm, but it is also called the raking algorithm in some fields.
The presentation in this article is inspired by a paper by Carol Alderman (1992).
Alderman shows a typical example (p. 83), but let me give a smaller example. Suppose a troop of six Girl Scouts sells seven different types of cookies. At the end of the campaign, each girl reports how many boxes she sold. Also, the troop leader knows how many boxes of each cookie type were sold. However, no one kept track of how many boxes of each type were sold by each girl.
The situation is summarized by the following 7 x 6 table.
We know the row totals, which are the numbers of boxes that were sold for each cookie type.
We also know the column totals, which are the numbers of boxes that were sold by each girl.
But we do not know the values of the individual cells, which are the type-by-girl totals.
As stated, the problem usually has infinitely many solutions. For an r x c table, there are r*c cells, but the marginal totals only provide r + c linear constraints. Except for 2 x 2
tables, there are more unknowns than equations, which leads to an underdetermined linear system that (typically) has infinitely many solutions.
Fortunately, we can often obtain a unique solution if we provide an informed guess for the cells in the table.
Perhaps we can ask the girls to provide estimates of the number of boxes of each type that they sold. Or perhaps we have values for the cells from the previous year. In either case, if we can estimate the cell values, we can use a matrix balancing algorithm to adjust the cells so that they match the observed marginal totals.
Let’s suppose that the troop leader asks the girls to estimate (from memory) how many boxes of each type they sold.
The estimates are shown in the following SAS/IML program.
The row and column sums of the estimates do not match the known marginal sums, but that’s okay.
The program uses subscript reduction operators to compute the marginal sums of the girls’ estimates. These are displayed next to the actual totals, along with the differences.
proc iml;
/* Known marginal totals: */
u = {260, 214, 178, 148, 75, 67, 59};  /* total boxes sold, by type */
v = {272 180 152 163 134 100};         /* total boxes sold, by girl */

/* We don't know the cell values that produce these marginal totals,
   but we can ask each girl to estimate how many boxes of each type she sold */
/*    G1 G2 G3 G4 G5 G6          */
A = {75 45 40 40 40 30 ,  /* C1 */
     40 35 45 35 30 30 ,  /* C2 */
     40 25 30 40 30 20 ,  /* C3 */
     40 25 25 20 20 20 ,  /* C4 */
     30 25  0 10 10  0 ,  /* C5 */
     20 10 10 10 10  0 ,  /* C6 */
     20 10  0 10  0  0 }; /* C7 */

/* notice that the row/col totals for A do not match the truth: */
rowTotEst = A[ ,+];   /* for each row, sum across all cols */
colTotEst = A[+, ];   /* for each col, sum down all rows */

Type = 'Cookie1':'Cookie7';  /* row labels */
Person = 'Girl1':'Girl6';    /* column labels */

/* print known totals, estimated totals, and difference */
rowSums = (u || rowTotEst || (u-rowTotEst));
print rowSums[r=Type c={'rowTot (u)' 'rowTotEst' 'Diff'}];
colSums = (v // colTotEst // (v-colTotEst));
print colSums[r={'colTot (v)' 'colTotEst' 'Diff'} c=Person];
You can see that the marginal totals do not match the known row and column totals.
However, we can use the guesses as an initial estimate for the RAS algorithm. The RAS algorithm will adjust the estimates to obtain a new matrix that DOES satisfy the marginal totals.
The RAS algorithm can iteratively adjust the girls’ estimates until the row sums and column sums of the matrix match the known row sums and column sums. The resulting matrix will probably not be an integer matrix.
It will, however, reflect the structure of the girls’ estimates in three ways:
The RAS algorithm is explained in Alderman (1992). The algorithm starts with a nonnegative r x c matrix.
Let u be the observed r x 1 vector of row sums.
Let v be the observed 1 x c vector of column sums.
The main steps of the RAS algorithm are:
The RAS algorithm is implemented in the following SAS/IML program:
/* Use the RAS matrix scaling algorithm to find a matrix X that approximately satisfies
   X[ ,+] = u   (marginal constraints on rows)
   X[+, ] = v   (marginal constraints on cols)
   and X is obtained by scaling rows and columns of A */
start RAS(A, u, v);
   X = A;                    /* Step 0: Initialize X */
   converged = 0;
   do k = 1 to 100 while(^Converged);
      /* Step 1: Row scaling: X = u # (X / X[ ,+]); */
      R = u / X[ ,+];
      X = R # X;
      /* Step 2: Column scaling: X = (X / X[+, ]) # v; */
      S = v / X[+, ];
      X = X # S;
      /* check for convergence: u=rowTot and v=colTot */
      rowTot = X[ ,+];       /* for each row, sum across all cols */
      colTot = X[+, ];       /* for each col, sum down all rows */
      maxDiff = max( abs(u-rowTot), abs(v-colTot) );
      Converged = (maxDiff < 1e-8);
   end;
   return( X );
finish;

X = RAS(A, u, v);
print X[r=Type c=Person format=7.1];
The matrix X has row and column sums that are close to the specified values. The matrix also preserves relationships among individual columns and rows, based on the girls’ recollections of how many boxes they sold.
The algorithm preserves zeros for the girls who claim they sold zero boxes of a certain type. It does not introduce any new zeros for cells.
You can’t determine how close the X matrix is to “the truth,” because we don’t know the true number of boxes that each girl sold.
The X matrix depends on the girls’ recollections and estimates. If the recollections are close to the truth, then X will be close to the truth as well.
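For readers who want to experiment outside of SAS, here is a minimal pure-Python sketch of the same RAS iteration, using the data from this example: scale each row to its target sum, then each column, and repeat until both margins match.

```python
u = [260, 214, 178, 148, 75, 67, 59]   # target row sums (cookie types)
v = [272, 180, 152, 163, 134, 100]     # target column sums (girls)
A = [[75, 45, 40, 40, 40, 30],
     [40, 35, 45, 35, 30, 30],
     [40, 25, 30, 40, 30, 20],
     [40, 25, 25, 20, 20, 20],
     [30, 25,  0, 10, 10,  0],
     [20, 10, 10, 10, 10,  0],
     [20, 10,  0, 10,  0,  0]]

def ras(A, u, v, tol=1e-8, max_iter=100):
    X = [row[:] for row in A]           # Step 0: initialize with the guess
    m, n = len(u), len(v)
    for _ in range(max_iter):
        for i in range(m):              # Step 1: row scaling
            r = u[i] / sum(X[i])
            X[i] = [x * r for x in X[i]]
        for j in range(n):              # Step 2: column scaling
            s = v[j] / sum(X[i][j] for i in range(m))
            for i in range(m):
                X[i][j] *= s
        # converged when both margins match the targets
        if (all(abs(sum(X[i]) - u[i]) < tol for i in range(m)) and
            all(abs(sum(X[i][j] for i in range(m)) - v[j]) < tol
                for j in range(n))):
            break
    return X

X = ras(A, u, v)
```

Note that zero cells stay exactly zero: multiplying zero by any scale factor leaves it zero, which is how the algorithm preserves the structural zeros in the girls' estimates.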
This article looks at the RAS algorithm, which is one way to perform matrix balancing on a table that has nonnegative cells.
“Balancing” refers to adjusting the interior cells of a matrix to match observed marginal totals.
The RAS algorithm starts with an estimate, A. The estimate does not need to satisfy the marginal totals,
but it should reflect relationships between the rows and columns that you want to preserve, such as relative proportions and zero cells. The RAS algorithm iteratively scales the rows and columns of A to create a new matrix that matches the observed row and column sums.
It is worth mentioning that the row scaling could be accumulated into a diagonal matrix, R,
and the column scaling into a diagonal matrix, S. Thus, the balanced matrix is the product of
these diagonal matrices and the original estimate, A. As a matrix equation, you can write X = RAS. It is this matrix factorization that gives the RAS algorithm its name.
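Under this interpretation, you can accumulate the per-iteration scale factors into vectors r and s so that the balanced matrix is X[i][j] = r[i]*A[i][j]*s[j]. The following is a sketch of that bookkeeping (the ras_factors helper name and the fixed iteration count are my own choices, not from the original post):

```python
u = [260, 214, 178, 148, 75, 67, 59]
v = [272, 180, 152, 163, 134, 100]
A = [[75, 45, 40, 40, 40, 30],
     [40, 35, 45, 35, 30, 30],
     [40, 25, 30, 40, 30, 20],
     [40, 25, 25, 20, 20, 20],
     [30, 25,  0, 10, 10,  0],
     [20, 10, 10, 10, 10,  0],
     [20, 10,  0, 10,  0,  0]]

def ras_factors(A, u, v, n_iter=200):
    """Accumulate row/column scale factors so X = diag(r) * A * diag(s)."""
    m, n = len(u), len(v)
    r = [1.0] * m
    s = [1.0] * n
    for _ in range(n_iter):
        for i in range(m):   # row scaling folded into r
            rowsum = sum(r[i] * A[i][j] * s[j] for j in range(n))
            r[i] *= u[i] / rowsum
        for j in range(n):   # column scaling folded into s
            colsum = sum(r[i] * A[i][j] * s[j] for i in range(m))
            s[j] *= v[j] / colsum
    return r, s

r, s = ras_factors(A, u, v)
# Reconstruct the balanced matrix from the diagonal factors:
X = [[r[i] * A[i][j] * s[j] for j in range(len(v))] for i in range(len(u))]
```

This makes the name of the method concrete: the result really is the product R*A*S of a row-scaling diagonal, the original estimate, and a column-scaling diagonal.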
The RAS is not the only algorithm for matrix balancing, but it is the simplest. Another algorithm, called iterative proportional fitting (IPF), will be discussed in a second article.
Using localespecific format catalogs to create reports in multiple languages was published on SAS Users.
Locale-specific SAS® format catalogs make reporting in multiple languages more dynamic. It is easy to generate reports in different languages when you use both the LOCALE option in the FORMAT procedure and the LOCALE= system option to create these catalogs. If you are not familiar with the LOCALE= system option, see the “Resources” section below for more information.
This blog post, inspired by my work on this topic with a SAS customer, focuses on how to create and use locale-specific informats to read in numeric values from a Microsoft Excel file and then transform them into SAS character values. I incorporated this step into a macro that transforms ones and zeroes from the Excel file into meaningful information for multilingual readers.
The first step is to submit the LOCALE= system option with the value fr_FR. For the example in this article, I chose the values fr_FR and en_US for French and English from this table of LOCALE= values. (That is because I know how to say “yes” and “no” in both English and French — I need to travel more!)
options locale=fr_fr;

The following code uses both the INVALUE statement and the LOCALE option in PROC FORMAT to create an informat that is named $PT_SURVEY:
proc format locale library=work;
   invalue $pt_survey 1='oui' 0='non';
run;
Now, toggle the LOCALE= system option and create a second informat using labels in a different language (in this example, it is English):
options locale=en_us;
proc format locale library=work;
   invalue $pt_survey 1='yes' 0='no';
run;
In the screenshot below, which shows the output from the DATASETS procedure, you can see that PROC FORMAT created two format catalogs using the specified locale values, which are preceded by underscore characters. If the format catalogs already exist, PROC FORMAT simply adds the $PT_SURVEY informat entry type to them.
proc datasets memtype=catalog; quit; 
Before you use these informats for a report, you must tell SAS where the informats are located. To do so, specify /LOCALE after the libref name within the FMTSEARCH= system option. If you do not add the /LOCALE specification, you see an error message stating either that the $PT_SURVEY informat does not exist or that it cannot be found. In the next two OPTIONS statements, SAS searches for the locale-specific informat in the FORMATS_FR_FR catalog, which PROC FORMAT created in the WORK library:
options locale=fr_fr;
options fmtsearch=(work/locale);
If you toggle the LOCALE= system option to have the en_US locale value, SAS then searches for the informat in the other catalog that was created, which is the FORMATS_EN_US catalog.
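The idea of locale-keyed lookup is not SAS-specific. As a loose analogy (a hypothetical Python sketch, not part of the original post), you can think of each locale-specific catalog as one entry in a dictionary of mappings, with the current locale setting selecting which mapping is applied at run time:

```python
# Locale-keyed "informats": one mapping per locale, selected at run
# time -- analogous to SAS searching the FORMATS_FR_FR or
# FORMATS_EN_US catalog named by the LOCALE= system option.
informats = {
    "fr_FR": {1: "oui", 0: "non"},
    "en_US": {1: "yes", 0: "no"},
}

def pt_survey(value, locale):
    """Apply the locale-specific mapping to a numeric survey value."""
    return informats[locale][value]

print(pt_survey(1, "fr_FR"), pt_survey(0, "en_US"))
```

Toggling the locale changes which mapping is consulted without changing any of the reporting logic, which is exactly the benefit the locale-specific catalogs provide in the SAS macro below.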
For this example, you can create an Excel file by using the ODS EXCEL destination from the REPORT procedure output. Although you can create the Excel file in various ways, the reason that I chose the ODS EXCEL statement was to show you some options that can be helpful in this scenario and are also useful at other times.
Use the ODS EXCEL destination to create a file from PROC REPORT. I specify the TAGATTR= style attribute using “TYPE:NUMBER” for the Q_1 variable:
%let path=%sysfunc(getoption(WORK));
filename temp "&path\surveys.xlsx";
ods excel file=temp;

data one;
   infile datalines truncover;
   input ptID Q_1;
   datalines;
111 0
112 1
;
run;

proc report data=one;
   define ptID / display style(column)={tagattr="type:String"};
   define Q_1 / style(column)={tagattr="type:Number"};
run;

ods excel close;
Now you have a file that looks like this screenshot when it is opened in Excel. Note that the data value for the Q_1 column is numeric:
The IMPORT procedure uses the DBSASTYPE= data set option to convert the numeric Excel data into SAS character values. Then I can apply the locale-specific character informat to a character variable.
As you will see below, in the macro, I use DBMS=EXCEL in PROC IMPORT to read the Excel file because my SAS and Microsoft Office versions are both 64-bit. (You might have to use the PCFILES LIBNAME Engine to connect to Excel through the SAS PC Files Server if you are not set up this way.)
The final step is to run the macro with parameters to produce the two reports in French and English, using the locale-specific catalogs. When the macro is called, depending on the parameter value for the macro variable LOCALE, the LOCALE= system option changes, and the $PT_SURVEY informat from the locale-specific catalog is applied. These two tabular reports are produced:
Here is the full code for the example:
%let path=%sysfunc(getoption(WORK));
filename temp "&path\surveys.xlsx";
ods excel file=temp;

data one;
   infile datalines truncover;
   input ptID Q_1;
   datalines;
111 0
112 1
;
run;

proc report data=one;
   define ptID / display style(column)={tagattr="type:String"};
   define Q_1 / style(column)={tagattr="type:Number"};
run;

ods excel close;

options locale=fr_fr;
proc format locale library=work;
   invalue $pt_survey 1='oui' 0='non';
run;

options locale=en_us;
proc format locale library=work;
   invalue $pt_survey 1='yes' 0='no';
run;

/* Set the FMTSEARCH option */
options fmtsearch=(work/locale);

/* Compile the macro */
%macro survey(locale,out);
   /* Set the LOCALE system option */
   options locale=&locale;

   /* Import the Excel file */
   filename survey "&path\surveys.xlsx";
   proc import dbms=excel datafile=survey out=work.&out replace;
      getnames=yes;
      dbdsopts="dbsastype=(Q_1='char(8)')";
   run;

   data work.&out;
      set work.&out;
      /* Create a new variable for the report whose values are assigned by
         specifying the locale-specific informat in the INPUT function */
      newvar=input(Q_1, $pt_survey.);
      label newvar='Q_1';
   run;

   options missing='0';
   /* Create the tabular report */
   proc tabulate data=&out;
      class ptID newvar;
      table ptID='Patient ID', newvar*n=' '/box="&locale";
   run;
%mend survey;

/* Call the macros */
%survey(fr_fr,fr)
%survey(en_us,en)
For a different example that does not involve an informat, you can create a format in a locale-specific catalog to print a data set in both English and Romanian. See Example 19: Creating a Locale-Specific Format Catalog in the Base SAS® 9.4 Procedures Guide.
For more information about the LOCALE option:
For more information about reading and writing Excel files:
For more information about creating macros and using the macro facility in SAS: