- “Information is centralized. Action is decentralized.”
- “Data science teams need to be run by data scientists.”
- “Look for ways to marry different data sources.”
- “Data analytics means being customer focused.”

```
PROC SETINIT;
RUN;
```

P.S. If you are in a special interest group (e.g. the government), then include the NOALIAS option:
```
PROC SETINIT NOALIAS;
RUN;
```

In your log you will see a listing of all the licensed products. Here’s my log, which shows a very long list of SAS products. Because I work for SAS as a trainer, I need to have many more SAS products installed so that I can help different customers with their varying SAS questions.
If you want more and are a super user, you may like this macro. I love it: it creates a bunch of datasets in the work library that show you the products licensed, the products installed, any hot fixes applied, and much, much more. Did we say this was free too? Go here to download this macro and then run it to watch the magic happen. For more details, check out this great blog post by Larry LaRusso.
**SAS/GRAPH** software is a component of the SAS System, an applications system for data access, management, analysis, and presentation. SAS/GRAPH users can customize graphs with the software and present multiple graphs on a page. It includes procedures like PROC GCHART.

**SAS/Access to PC files** allows you to read PC software files such as Microsoft Excel workbooks and Microsoft Access databases and treat them as if they are a native SAS dataset.

**SAS/STAT** meets both specialized and enterprise-wide statistical needs, from traditional analysis of variance and linear regression to Bayesian inference and high-performance modeling tools for massive data.

- Will your sales people forecast low during quota setting time, to make it easier to achieve their bonuses?
- Will a product manager forecast high for a proposed new product, to make sure it meets the hurdles for approval?

- Determine what level of accuracy is reasonable to expect given the nature of your demand patterns.

- Direct all efforts toward achieving that level of accuracy with the least cost in time and company resources.

- Automate, automate, automate wherever possible.

- Do not squander organizational resources in pursuit of unrealistic accuracy goals.


The following SAS/IML program reads numerical data into a matrix from the Sashelp.Class data set. The matrix has 19 rows and three columns. The program then generates a matrix of the same size that contains a random pattern of zeros and ones, where about 40% of the values will be ones. The LOC function is used to find the locations of the ones, and the corresponding locations in the data are set to missing:

```
proc iml;
use Sashelp.Class; /* read numeric data into X */
read all var _NUM_ into X;
close;
/* random assignment of missing values */
RandX = X; /* copy data */
p = 0.4; /* approx proportion of missing elements */
call randseed(1234);
B = randfun(dimension(X), "bern", p); /* random 0s or 1s */
missIdx = loc(B=1); /* find position of 1s */
if ncol(missIdx)>0 then
RandX[missIdx] = .; /* replace 1s with missing */
print RandX;
```

In this way, you can replace a certain percentage of the data values with missing values.

In the preceding section, the technique for inserting missing values does not use the fact that the matrix **B** is random. The technique works with any zero-one matrix **B** that specifies a pattern of missing values. For example, you can create a matrix that contains all combinations of zeros and ones, then use that pattern to set missing values, as follows:

```
C = { 0 0 0,
0 0 1,
0 1 0,
0 1 1,
1 0 0,
1 0 1,
1 1 0,
1 1 1 }; /* pattern matrix */
missIdx = loc(C=1);
SysX = X; /* copy data */
if ncol(missIdx)>0 then
SysX[missIdx] = .; /* replace 1s with missing */
print SysX;
```

You could also specify the locations of the missing values by using subscripts of the data matrix. You can use the SUB2NDX function to convert subscripts to indices.
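As a minimal sketch of the subscript approach, the following program sets three cells to missing. The (row, column) pairs in `subs` and the name `SubX` are hypothetical choices made for illustration; any valid subscripts would work the same way.

```
proc iml;
use Sashelp.Class;                      /* read numeric data into X */
read all var _NUM_ into X;
close;
/* hypothetical (row, column) subscripts of the cells to set to missing */
subs = {1 2,
        3 1,
        5 3};
missIdx = sub2ndx(dimension(X), subs);  /* convert subscripts to indices */
SubX = X;                               /* copy data */
SubX[missIdx] = .;                      /* set the chosen cells to missing */
print SubX;
quit;
```

Because SAS/IML stores matrices in row-major order, the subscript (3,1) of a three-column matrix corresponds to index 7.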

In the SAS DATA step you can use arrays to create a random pattern of missing values. For example, the following DATA step reads the numerical variables from the Sashelp.Class data and randomly assigns about 40% of the data values to missing:

```
/* generate missing values in random locations */
data RandClass(drop=i);
call streaminit(1234);
set Sashelp.Class(keep=_NUMERIC_);
array x {*} _numeric_;
do i = 1 to dim(x);
if rand("Bern", 0.4) then /* p=0.4 ==> about 40% missing */
x[i]=.;
end;
run;
proc print; run;
```

The output is not shown, but the random pattern is identical to the random pattern that was generated by using SAS/IML matrices.

You could use the DATA step to specify patterns of missing values for which there is a formula, such as every fourth data value (MOD(cnt,4)=1). However, it is harder to generate an arbitrary pattern, such as the "all combinations" pattern in the previous section. In general, I think the SAS/IML approach is easier to use and more flexible.
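For the formula-based case, here is one possible sketch of the "every fourth data value" pattern. It assumes that values are counted across each row and then down the rows; the data set name `SysClass` and the counter `cnt` are names introduced only for this illustration.

```
/* set every fourth data value to missing, counting across rows */
data SysClass(drop=i cnt);
set Sashelp.Class(keep=_NUMERIC_);
array x {*} _numeric_;
do i = 1 to dim(x);
   cnt + 1;                       /* running count of data values (retained) */
   if mod(cnt,4)=1 then x[i]=.;   /* values 1, 5, 9, ... become missing */
end;
run;
proc print data=SysClass; run;
```

The sum statement `cnt + 1` retains the counter across observations, so the pattern continues from one row to the next instead of restarting in each row.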

For any pattern of missing values, you can use PROC MI to summarize the pattern. You can also use various graphical techniques to visualize the pattern of missing data.
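As a sketch of the PROC MI summary, the following call assumes that the RandClass data set from the DATA step above exists. The NIMPUTE=0 option requests no imputations, and the ODS SELECT statement displays only the table of missing data patterns:

```
/* summarize the pattern of missing values without imputing anything */
proc mi data=RandClass nimpute=0;
   var Age Height Weight;
   ods select MissPattern;
run;
```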

was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a “pregnancy prediction” score. More important, he could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.

And the company did just that. As the story goes, Target knew that a 16-year-old woman was pregnant before her own father did. To be sure, few stories illustrate the power of data and analytics as vividly as Duhigg's account. By all accounts, Target did nothing illegal. Its management simply wanted to market products of interest to its customers as effectively as possible. When viewed through the lens of a cut-throat retail environment, actions such as these aren't simply optional; they're usually required, especially for publicly traded companies. Still, it's no overstatement to call the Target story a

Read: Big data privacy: Four ways your data governance strategy affects privacy, security and trust

| Control Limits | Specification Limits |
| --- | --- |
| Calculated from data | Given by customer or design |
| Based on variability | Based on system requirements |
| Applied to summary statistics | Applied to individual measurements |
| Applied to process measurements, perhaps on product, perhaps not | Applied to product characteristics |
| The voice of the process | The voice of the customer |

*CV = Standard Deviation / Mean*
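As a rough illustration of the formula, PROC MEANS can report the coefficient of variation directly. The choice of the Sashelp.Class data and the Height and Weight variables is an assumption made only for this example. Note that PROC MEANS reports the CV as a percentage, that is, 100 times the ratio above:

```
/* mean, standard deviation, and coefficient of variation (in percent) */
proc means data=Sashelp.Class mean std cv;
   var Height Weight;
run;
```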

```
title "Tumor Response by Week";
ods graphics / reset width=5in height=3in imagename='Spider';
proc sgplot data=spider noborder tmplout='c:\spider.sas';
  format tgroup $growth.;
  symbolchar name=ongoing char='2192'x / scale=1;
  symbolchar name=growtht char='2020'x / scale=1;
  symbolchar name=growthnt char='2021'x / scale=1;
  styleattrs datacontrastcolors=(green gold red)
             datasymbols=(ongoing growtht growthnt);
  refline 0 / lineattrs=(pattern=shortdash);
  series x=week y=change / group=subject grouplc=rgroup groupmc=rgroup
         markers markerattrs=(symbol=circlefilled)
         lineattrs=(thickness=2 pattern=solid) name='a';
  scatter x=weekS y=change / group=TGroup markerattrs=(size=16 color=black)
          nomissinggroup name='b';
  keylegend 'a' / title='Response' type=linecolor valueattrs=(size=7)
            location=inside position=topright across=1 opaque;
  keylegend 'b' / valueattrs=(size=7) noborder;
  xaxis label='Week';
run;
```