Sunday, February 28, 2016

Display all categories on CRF even if there is no count

Very often in a statistical summary table for a variable (such as race or age group), we need to display all categories listed on the case report form even though no subject may fall into one of the categories.

See the paper by Phillips and Klein, "Oh No, a Zero Row: 5 Ways to Summarize Absolutely Nothing", for a discussion of various ways of handling this.

SPARSE option in PROC FREQ for filling the data set

The SPARSE option in PROC FREQ is not properly named. It is neither meager nor thin in its ability. It is a very powerful option in the TABLES statement. Simply stated, the SPARSE option provides “all possible combinations of levels of the variables in the table, even when some combination levels do not occur in the data.” This is a huge help when trying to zero-fill a data set. Although current versions of SAS® now contain the new PRELOADFMT option for TABULATE and REPORT, for some types of reports, this new option will not work.
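A minimal sketch of zero-filling with SPARSE, assuming a made-up data set DEMO with variables TRT and RACE (these names are for illustration only). Note that SPARSE works on a crosstabulation and can only fill in levels that occur somewhere in the data; a category that appears in no record at all will still be absent.

```sas
/* Hypothetical data: treatment B has no BLACK subject */
data demo;
  input trt $ race $;
  datalines;
A WHITE
A BLACK
B WHITE
;
run;

/* SPARSE adds the missing trt*race combination to the OUT= data set */
proc freq data=demo noprint;
  tables trt*race / sparse out=counts;
run;

/* COUNTS has 4 rows; the row for trt=B, race=BLACK has COUNT=0 */
proc print data=counts;
run;
```

Without SPARSE, the OUT= data set would contain only the three combinations that are present in DEMO.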

See the article by Chris Moriak
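For comparison, here is a hedged sketch of the PRELOADFMT approach mentioned above, using PROC TABULATE. The format $RACEF, the data set DEMO, and the category values are assumptions made up for this example; the idea is that the format lists every category on the CRF, so a category with no subjects (ASIAN here) still gets a row.

```sas
/* Hypothetical format listing every CRF category */
proc format;
  value $racef
    'WHITE' = 'White'
    'BLACK' = 'Black or African American'
    'ASIAN' = 'Asian';
run;

/* Hypothetical data: no subject is ASIAN */
data demo;
  input usubjid race $;
  datalines;
1 WHITE
2 BLACK
3 WHITE
;
run;

/* PRELOADFMT loads all format levels; PRINTMISS keeps the empty row */
/* and MISSTEXT prints 0 instead of a blank in the empty cell        */
proc tabulate data=demo;
  class race / preloadfmt;
  format race $racef.;
  table race, n / printmiss misstext='0';
run;
```

Unlike SPARSE, this approach can display a category that never occurs in the data, because the levels come from the format rather than from the observed values.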

Monday, February 1, 2016

How to find a character or word in a free text variable?

It is often difficult to summarize the information in a free text variable. To do so, we may need to find a specific character or word using a SAS function.

The SAS function INDEX() can be used for this purpose. It returns the position of the first occurrence of the search string, or 0 if the string is not found. The following statement will select any records with the word 'PNEUMONIA' contained in the VARNAME variable.

       where index(varname, 'PNEUMONIA')>=1;

If the text may contain the word in mixed lower and upper case, we can convert it to upper case first with the upcase() function.

       where index(upcase(varname), 'PNEUMONIA')>=1;
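A small runnable sketch of the search, assuming a made-up data set AE with a free text variable AETERM (both names are hypothetical). It also shows the FIND() function with the 'i' modifier, which is an alternative to combining INDEX() with UPCASE() for a case-insensitive search.

```sas
/* Hypothetical free text data */
data ae;
  input aeterm $40.;
  datalines;
Pneumonia, bacterial
HEADACHE
aspiration pneumonia
;
run;

/* INDEX with UPCASE: case-insensitive match on the upper-cased text */
data pneu1;
  set ae;
  where index(upcase(aeterm), 'PNEUMONIA') >= 1;
run;

/* FIND with the 'i' modifier ignores case without changing the text */
data pneu2;
  set ae;
  where find(aeterm, 'pneumonia', 'i') >= 1;
run;

/* Both data sets keep the first and third records only */
proc print data=pneu2;
run;
```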

Reference:
How can I find things in a character variable in SAS?

Friday, January 15, 2016

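Calculating the difference between the last value and the first value using the RETAIN statement

The program below demonstrates the use of the RETAIN statement to calculate the difference between the last value and the first value for each id. RETAIN keeps the first score of each id available on later rows, where it would otherwise be reset to missing.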

data x1;
       input id days score;
 datalines;
1 1 23
1 2 34
2 1 45
2 2 46
3 1 35
3 2 40
;

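/* Sort by id and days so that first.id / last.id are the first and last visits */
proc sort data=x1;
    by id days;
run;

data new;
     set x1;
     by id;
     retain score1;                    /* carry the first score across rows */
     if first.id then score1=score;    /* store the first score of each id */
     diff=score-score1;
     if last.id then output;           /* one row per id: last minus first */
run;

proc print data=new;
run;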



Monday, January 4, 2016

Forest Plot of Hazard Ratios by Patient Subgroups using SAS

The original paper can be found here: 
Base SAS: GTL Template Language
SAS 9.4 GTL feature: Value statement's TextAttrs option

%let graphs='.';

%let dpi=100;
%let w=8in;
%let h=4.5in;

/*--Leading blanks in the subgroup variable must be non-breaking spaces--*/
/*--Use character value 'A0'x, or copy from Windows System Character Map--*/
/*--Regular leading blanks will be stripped, losing the indentation    --*/
data forest;
  input Indent Subgroup $3-27 Count Percent Mean  Low  High  PCIGroup Group PValue;
  format PCIGroup Group PValue 7.2;
  zero=0; 
  if count ne . then CountPct=put(count, 4.0) || "(" || put(percent, 3.0) || ")";
  datalines;
0 Overall..................2166  100  1.3   0.9   1.5  17.2  15.6  .
0 Age.......................     .    .     .     .    .     .     0.05
2 <= 65 Yr.................1534   71  1.5   1.05  1.9  17.0  13.2   .
2 > 65 Yr.................. 632   29  0.8   0.6   1.25 17.8  21.3   .
0 Sex.......................     .    .     .     .    .     .     0.13
2 Male.....................1690   78  1.5   1.05  1.9  16.8  13.5   .
2 Female................... 476   22  0.8   0.6   1.3  18.3  22.9   . 
0 Race or ethnic group......     .    .     .     .    .     .     0.52
2 Nonwhite................. 428   20  1.05  0.6   1.8  18.8  17.8   .
2 White....................1738   80  1.2   0.6   1.6  16.7  15.0   . 
0 From MI to Randomization..     .    .     .     .    .     .     0.81
2 <= 7 days................ 963   44  1.2   0.8   1.5  18.9  18.6   .
2 > 7 days.................1203   56  1.15  0.75  1.5  15.9  12.9   .
0 Infarct-related artery....     .    .     .     .    .     .     0.38
2 LAD...................... 781   36  1.4   0.9   1.9  20.1  16.2   .
2 Other....................1385   64  1.1   0.8   1.4  15.6  15.3   . 
0 Ejection Fraction.........     .    .     .     .    .     .     0.48
2 < 50%....................1151   54  1.2   0.8   1.5  22.6  20.4   .
2 >= 50%................... 999   46  0.9   0.6   1.4  10.7  11.1   . 
0 Diabetes..................     .    .     .     .    .     .     0.41
2 Yes...................... 446   21  1.4   0.9   2.0  29.3  23.3   .
2 No.......................1720   79  1.1   0.8   1.5  14.4  13.5   . 
0 Killip class..............     .    .     .     .    .     .     0.39
2 I........................1740   81  1.2   0.8   1.6  15.2  13.1   .
2 II-IV.................... 413   19  0.95  0.6   1.5  25.3  26.9   . 
;
run;

/*--Replace '.' in subgroup with blank--*/
data forest2;
  set forest;
  subgroup=translate(subgroup, ' ', '.');
  val=mod(_N_-1, 6);
  indent=ifn(indent eq 2, 1, 0);
  if val eq 1 or val eq 2 or val eq 3 then ref=subgroup;
  run;

/*--Create style with smaller fonts for axis label, value and data--*/
proc template;
  define style listingSF; 
    parent = Styles.Listing; 
    style GraphFonts from GraphFonts                                                      
      "Fonts used in graph styles" /                                       
      'GraphDataFont' = ("<sans-serif>, <MTsans-serif>",7pt)
      'GraphValueFont' = ("<sans-serif>, <MTsans-serif>",6pt)
      'GraphLabelFont' = ("<sans-serif>, <MTsans-serif>",6pt, bold);
  end;
run;

/*--Define template for Forest Plot--*/
/*--Template uses a Layout Lattice of 6 columns--*/
proc template;
  define statgraph Forest;
  dynamic _show_bands _color _thk;
    begingraph;
      entrytitle 'Forest Plot of Hazard Ratios by Patient Subgroups ';
      discreteattrmap name='text';
        value '0' / textattrs=(weight=bold);
        value other;
      enddiscreteattrmap;
      discreteattrvar attrvar=type var=indent attrmap='text';

      layout lattice / columns=6 columnweights=(0.25 0.1 0.4 0.08 0.08 0.09);

      /*--Column headers--*/
      sidebar / align=top;
        layout lattice / rows=2 columns=4 columnweights=(0.2 0.25 0.25 0.3);
          entry textattrs=(size=8) halign=left "Subgroup";
          entry textattrs=(size=8) halign=left " No.of Patients (%)";
          entry textattrs=(size=8) halign=left "Hazard Ratio";
          entry halign=center textattrs=(size=8) "4-Yr Cumulative Event Rate" ;
          entry " "; 
          entry " "; 
          entry " "; 
          entry halign=center textattrs=(size=8) "Medical Therapy";
        endlayout;
      endsidebar;

      /*--First Subgroup column, shows only the Y2 axis--*/
      layout overlay / walldisplay=none xaxisopts=(display=none) 
          yaxisopts=(reverse=true display=none 
                     tickvalueattrs=(weight=bold));
        referenceline y=ref / lineattrs=(thickness=_thk color=_color);
        axistable y=subgroup value=subgroup / indentweight=indent textgroup=type;
       endlayout;

       /*--Second column showing Count and percent--*/
       layout overlay / xaxisopts=(display=none) 
            yaxisopts=(reverse=true display=none) walldisplay=none;
         referenceline y=ref / lineattrs=(thickness=_thk color=_color);
         axistable y=subgroup value=countpct;
       endlayout;

       /*--Third column showing hazard ratio graph--*/
       layout overlay / xaxisopts=(label='   <---PCI Better----  ----Medical Therapy Better--->'  
           linearopts=(tickvaluepriority=true 
                       tickvaluelist=(0.0 0.5 1.0 1.5 2.0 2.5)))
           yaxisopts=(reverse=true display=none) walldisplay=none;
         referenceline y=ref / lineattrs=(thickness=_thk color=_color);
         scatterplot y=subgroup x=mean / xerrorlower=low xerrorupper=high 
           markerattrs=(symbol=squarefilled);
         referenceline x=1;
       endlayout;

       /*--Fourth column showing PCIGroup--*/
       layout overlay / x2axisopts=(display=(tickvalues) offsetmin=0.25 offsetmax=0.25) 
            yaxisopts=(reverse=true display=none) walldisplay=none;
         referenceline y=ref / lineattrs=(thickness=_thk color=_color);
         axistable y=subgroup value=PCIGroup / display=(label) labelposition=max;
       endlayout;

       /*--Fifth column showing Group--*/
       layout overlay / x2axisopts=(display=(tickvalues) offsetmin=0.25 offsetmax=0.25) 
            yaxisopts=(reverse=true display=none) walldisplay=none;
         referenceline y=ref / lineattrs=(thickness=_thk color=_color);
         axistable y=subgroup value=group / display=(label) labelposition=max;
       endlayout;

       /*--Sixth column showing P-Values--*/
       layout overlay / x2axisopts=(display=(tickvalues) offsetmin=0.25 offsetmax=0.25) 
           yaxisopts=(reverse=true display=none) walldisplay=none;
         referenceline y=ref / lineattrs=(thickness=_thk color=_color);
         axistable y=subgroup value=pvalue / display=(label) labelposition=max;
       endlayout;

     endlayout;
     entryfootnote halign=left textattrs=(size=7) 
       'The p-value is from the test statistic for testing the interaction between the '
       'treatment and any subgroup variable';
     entryfootnote halign=left 'This graph uses the new AXISTABLE plot to display the textual columns';
   endgraph;
  end;
run;

/*--Need format to show missing as blank--*/
proc format;
  value misblank
    . = ' ';
run;

/*----Create Graph-----*/
ods listing style=htmlblue gpath=&graphs image_dpi=&dpi;
ods graphics / reset noscale width=&w height=&h imagename='GTL_ForestPlot';
proc sgrender data=Forest2 template=Forest;
format pvalue group pcigroup misblank7.2;
dynamic _color='cxf0f0f0' _thk=12;
run;

Friday, January 1, 2016

Dealing with SAS format catalog

libname library "the directory where formats catalog is stored";
libname out "the directory where the transformed format is stored";
*** To simply read a format catalog;
proc format library=library.formats fmtlib;
run;

*** To copy an existing format from one area to another and add some more formats;
*** to read a format data set;
proc format library=library.formats fmtlib cntlin=out.formats;
  value ...;
run;

*** to read (list the contents of) a catalog;
*** MEMTYPE= is not a PROC FORMAT option, so PROC CATALOG is used here;
proc catalog catalog=library.formats;
  contents;
run;
quit;

*** to print out a catalog;
proc format library=library.formats cntlout=out.formats;
run;

proc print data=out.formats;
run;



*** to convert the SAS catalog into SAS data set;
libname dest 'c:\temp\test';

options nofmterr;
proc format library=dest.formats fmtlib cntlout=dest.formats;
run;

***For more information, see the following link;
http://www.ats.ucla.edu/stat/sas/library/formats.htm
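
The steps above can be tried end to end with a small self-contained sketch that needs no external directory. Everything here lives in the WORK library; the format SEXF, the data set FMTDATA, and the catalog COPYFMTS are names made up for this example.

```sas
/* Create a sample format; it is stored in WORK.FORMATS by default */
proc format;
  value sexf
    1 = 'Male'
    2 = 'Female';
run;

/* Convert the format catalog into a SAS data set */
proc format library=work.formats cntlout=work.fmtdata;
run;

/* Inspect the control data set (FMTNAME, START, END, LABEL, ...) */
proc print data=work.fmtdata;
  var fmtname start end label;
run;

/* Rebuild the formats from the data set in a second catalog */
proc format library=work.copyfmts cntlin=work.fmtdata;
run;

/* List the contents of the new catalog */
proc format library=work.copyfmts fmtlib;
run;
```

The same CNTLOUT= / CNTLIN= round trip applies to a permanent library; only the libref changes.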