Welcome back to my SAS Users blog series CAS Action! – a series on fundamentals. In my previous post, CAS-Action! Simple Frequency Tables - Part 1, I reviewed how to use the simple.freq CAS action to generate frequency distributions for one or more columns using the distributed CAS server. In this post I'll show you how to save the results of the freq action as a SAS data set or a distributed CAS table.
In this example, I'll use the CAS language (CASL) to execute the freq CAS action. Remember, instead of using CASL, I could execute the same action with Python, R, and other languages with some slight changes to the syntax for the specific language. Refer to the documentation for syntax in other languages.
Load the demonstration data into memory
I'll start by executing the loadTable action to load the WARRANTY_CLAIMS_0117.sashdat file from the Samples caslib into memory. By default the Samples caslib should be available in your SAS Viya environment. I'll load the table into the Casuser caslib and then clean up the CAS table by renaming and dropping columns to make the table easier to use. For more information on how to rename columns, check out my previous post. Lastly, I'll execute the fetch action to preview 5 rows.
    proc cas;
        * Specify the input/output CAS table *;
        casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};

        * Load the CAS table into memory *;
        table.loadTable /
            path = "WARRANTY_CLAIMS_0117.sashdat", caslib = "samples",
            casOut = casTbl + {replace=TRUE};

        * Rename columns with the labels. Spaces replaced with underscores *;

        * Store the results of the columnInfo action in a dictionary *;
        table.columnInfo result=cr / table = casTbl;

        * Loop over the columnInfo result table and create a list of dictionaries *;
        listElementCounter = 0;
        do columnMetadata over cr.ColumnInfo;
            listElementCounter = listElementCounter + 1;
            convertColLabel = tranwrd(columnMetadata['Label'],' ','_');
            renameColumns[listElementCounter] = {name = columnMetadata['Column'],
                                                 rename = convertColLabel,
                                                 label = ""};
        end;

        * Rename columns and keep only the columns listed below *;
        keepColumns = {'Campaign_Type', 'Platform','Trim_Level','Make','Model_Year','Engine_Model',
                       'Vehicle_Assembly_Plant','Claim_Repair_Start_Date', 'Claim_Repair_End_Date'};
        table.alterTable /
            name = casTbl['Name'], caslib = casTbl['caslib'],
            columns = renameColumns,
            keep = keepColumns;

        * Preview CAS table *;
        table.fetch / table = casTbl, to = 5;
    quit;
The results above show a preview of the warranty_claims CAS table.
One Way Frequency for Multiple Columns
Next, I'll execute the freq action to generate a frequency distribution for multiple columns.
    proc cas;
        casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};

        * Columns to analyze. The date column is grouped by quarter using the YYQ. format *;
        colNames = {'Model_Year',
                    'Vehicle_Assembly_Plant',
                    {name = 'Claim_Repair_Start_Date', format = 'yyq.'}
        };

        simple.freq / table = casTbl, inputs = colNames;
    quit;
The freq CAS action returns the frequency distribution of each column in a single result. While this is great, what if you want to create a visualization with the data? Or continue processing the summarized data? How do you save this as a table? Well, you have a few options.
Save the results as a SAS data set
First, you can save the results of a CAS action as a SAS data set. The idea here is that the CAS action processes the data in the distributed CAS server, and then the CAS server returns smaller, summarized results to the client (SAS Studio). The summarized results can then be saved as a SAS data set.
To save the results of a CAS action, simply add the result option after the action name, followed by a variable name. The action returns its results to the client as a dictionary, which is stored in the specified variable. For example, to save the results of the freq action as a SAS data set, complete the following steps:
- Execute the same CASL code from above, but this time specify the result option with a variable name to store the results of the freq action. Here I'll save the results in the variable freq_cr.
- Use the DESCRIBE statement to view the structure and data type of the CASL variable freq_cr in the log (not required).
- Use the SAVERESULT statement to save the CAS action result table from the dictionary freq_cr as a SAS data set named warranty_freq. To do this, specify the key Frequency that is stored in the dictionary freq_cr to obtain the result table.
    proc cas;
        * Reference the CAS table *;
        casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};

        * Specify the columns to analyze *;
        colNames = {'Model_Year',
                    'Vehicle_Assembly_Plant',
                    {name = 'Claim_Repair_Start_Date', format = 'yyq.'}
        };

        * 1. Analyze the CAS table and store the results *;
        simple.freq result = freq_cr / table = casTbl, inputs = colNames;

        * 2. View the dictionary in the log *;
        describe freq_cr;

        * 3. Save the result table as a SAS data set *;
        saveresult freq_cr['Frequency'] dataout=work.warranty_freq;
    quit;
In the log, the results of the DESCRIBE statement show that the variable freq_cr is a dictionary with one entry. It contains the key Frequency, and the value is a result table. The table contains 22 rows and 6 columns. The NOTE in the log shows that the SAVERESULT statement saved the result table from the dictionary as a SAS data set named warranty_freq in the Work library.
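Because the value stored under the Frequency key is a regular CASL result table, it can also be subset before it is saved. The following is a minimal sketch, not part of the original example: it assumes the CASL where operator for result tables behaves as described in the CASL Programmer's Guide, and the names modelYearFreq and model_year_freq are hypothetical.

```sas
proc cas;
    * Reference the CAS table *;
    casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};

    * Analyze two columns and store the results in the dictionary freq_cr *;
    simple.freq result = freq_cr /
        table = casTbl,
        inputs = {'Model_Year', 'Vehicle_Assembly_Plant'};

    * Keep only the Model_Year rows of the result table (hypothetical variable name) *;
    modelYearFreq = freq_cr.Frequency.where(Column == 'Model_Year');

    * Save the subset as a SAS data set (hypothetical data set name) *;
    saveresult modelYearFreq dataout=work.model_year_freq;
quit;
```

Subsetting on the client this way is only reasonable because the result table is already small and summarized.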
Once the summarized results are stored in a SAS library, use your traditional SAS programming knowledge to process the SAS table. For example, now I can visualize the summarized data using the SGPLOT procedure.
    * Plot the SAS data set *;
    title justify=left height=16pt "Total Warranty Claims by Year";
    proc sgplot data=work.warranty_freq noborder;
        where Column = 'Model_Year';
        vbar Charvar /
            response = Frequency
            nooutline;
        xaxis display=(nolabel);
        label Frequency = 'Total Claims';
        format Frequency comma16.;
    run;
Save the Results as a CAS Table
Instead of saving the summarized results as a SAS data set, you can create a new CAS table on the CAS server. To do that, all you need is to add the casOut parameter to the action. Here I'll save the results of the freq CAS action to a CAS table named warranty_freq in the Casuser caslib, and I'll give the table a descriptive label.
    proc cas;
        * Reference the CAS table *;
        casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};

        * Specify the columns to analyze *;
        colNames = {'Model_Year',
                    'Vehicle_Assembly_Plant',
                    {name = 'Claim_Repair_Start_Date', format = 'yyq.'}
        };

        * Analyze the CAS table and create a new CAS table *;
        simple.freq /
            table = casTbl,
            inputs = colNames,
            casOut = {
                name = 'warranty_freq',
                caslib = 'casuser',
                label = 'Frequency analysis by year, assembly plant and repair date by quarter'
            };
    quit;
The results above show that the freq action returned information about the newly created CAS table. Once you have a CAS table in the distributed CAS server, you can continue working with it using CAS, or you can visualize the data like we did before using SGPLOT. The key concept here is that the SGPLOT procedure does not visualize data on the CAS server. The SGPLOT procedure returns the entire CAS table back to SAS (the compute server) as a SAS data set, and then the visualization occurs on the client. This means that if the CAS table is large, an error or slow processing might occur. However, in our scenario we created a smaller, summarized CAS table, so sending 22 rows back to the client (the compute server) is not going to be an issue.
    * Make a library reference to a caslib *;
    libname casuser cas caslib='casuser';

    * Plot the CAS table. The column names in the casOut table are wrapped in underscores *;
    title justify=left height=16pt "Total Warranty Claims by Year";
    proc sgplot data=casuser.warranty_freq noborder;
        where _Column_ = 'Model_Year';
        vbar _Charvar_ /
            response = _Frequency_
            nooutline;
        xaxis display=(nolabel);
        label _Frequency_ = 'Total Claims';
        format _Frequency_ comma16.;
    run;
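If you would rather keep working with the summarized table on the CAS server, you can pass it to another CAS action. As a small sketch (not from the original walkthrough), the fetch action below previews only the Model_Year rows by adding a where expression to the table parameter. The column name _Column_ is taken from the SGPLOT code above, and the row limit of 20 is an arbitrary choice.

```sas
proc cas;
    * Preview the Model_Year rows of the summarized CAS table *;
    table.fetch /
        table = {name = 'warranty_freq',
                 caslib = 'casuser',
                 where = "_Column_ = 'Model_Year'"},
        to = 20;
quit;
```

The filter is applied on the CAS server, so only the matching rows are returned to the client.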
Summary
Using the freq CAS action enables you to generate a frequency distribution for multiple columns and to save the results as a SAS data set or a CAS table. The keys to this process are:
- CAS actions execute on the distributed CAS server and return summarized results back to the client as a dictionary. You can store the dictionary using the result option.
- Using dictionary manipulation techniques and the SAVERESULT statement, you can save the summarized result table from the dictionary as a SAS data set. Once you have the SAS data set, you can use all of your familiar SAS programming knowledge on the traditional compute server.
- Using the casOut parameter in a CAS action enables you to save the summarized results in the distributed CAS server.
- The SGPLOT procedure does not execute in CAS. If you specify a CAS table in the SGPLOT procedure, the entire CAS table will be sent back to the SAS compute server for processing. This can cause an error or slow processing on large tables.
- Best practice is to summarize large data in the CAS server, and then work with the summarized results on the compute server.
Additional resources
freq action
DESCRIBE statement
SAVERESULT statement
Plotting a Cloud Analytic Services (CAS) In-Memory Table
SAS® Cloud Analytic Services: CASL Programmer's Guide
SAS® Cloud Analytic Services: Fundamentals
CAS Action! – a series on fundamentals
Getting Started with Python Integration to SAS® Viya® – Index