Thursday, December 24, 2015

Calculating AUC using cubic spline interpolation or trapezoid rule

    This program uses PROC EXPAND to calculate the approximate area under
    the curve for some sample data.  The sample data should consist of
    (x,y) pairs.

    For this example, the sample data is generated from a high-degree
    polynomial. PROC EXPAND is then used to compute the approximate area
    under the curve using each of the following methods:

       a.  Cubic Spline interpolation.
       b.  Trapezoid rule.

    The exact area, given by the definite integral, is calculated
    for the polynomial curve in order to assess the precision of the
    two approximations.
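
    As an illustration of the exact-area calculation described above, here
    is a minimal sketch for a hypothetical polynomial (it is NOT the
    polynomial used by this program, which is not shown here). The
    integration limits match the lower and upper values used in this
    program; the variable names are assumptions made for the sketch.

```sas
/* Hypothetical polynomial, chosen only for illustration:        */
/*   y    = x**5 - 3*x**3 + 2*x + 4                              */
/* Antiderivative:                                               */
/*   F(x) = x**6/6 - 3*x**4/4 + x**2 + 4*x                       */
data _null_;
  lower = -2;
  upper = 1;
  F_upper = upper**6/6 - 3*upper**4/4 + upper**2 + 4*upper;
  F_lower = lower**6/6 - 3*lower**4/4 + lower**2 + 4*lower;
  exact = F_upper - F_lower;   /* definite integral from lower to upper */
  put exact=;                  /* analytically, 117/12 = 9.75 */
run;
```

    Any approximate area computed for the same hypothetical curve can then
    be compared against this value.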


%let lower=-2;
%let upper=1;
%let interval=0.2;

* generate some data according to a high-order polynomial;
data kvm;
 do x=&lower to &upper by &interval;

proc sort;
 by x;

/* PROC EXPAND will include a contribution for the last interval.  For
   an accurate approximation to the integral, we need to make sure that
   this last contribution is negligible.  So we'll append an additional
   x value which is extremely close to the last x value.  Of course, the
   two y values will be identical.  The result is that the last
   interval is extremely short, so its contribution to the integral
   approximation is negligible. */
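
The padding idea described in the comment above can be sketched as follows. This is only an illustration under stated assumptions: the data set names `demo` and `padded`, the polynomial, and the offset of 1e-8 are all hypothetical and are not taken from this program.

```sas
/* Hypothetical stand-in for the sorted (x,y) data */
data demo;
  do i = 0 to 15;
    x = -2 + 0.2*i;                 /* integer index avoids rounding drift in x */
    y = x**5 - 3*x**3 + 2*x + 4;    /* hypothetical polynomial */
    output;
  end;
  drop i;
run;

/* Append one extra point just past the last x, with the same y */
data padded;
  set demo end=eof;
  output;                 /* keep every original observation */
  if eof then do;
    x = x + 1e-8;         /* assumed offset; y is left unchanged */
    output;
  end;
run;
```

The offset only needs to be small relative to the spacing of the x values, so that the extra interval adds essentially nothing to the area.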

data one;
 set kvm end=eof;
 if eof then do;

proc print data=one(obs=50);
 title 'First few observations of the original data';

proc gplot data=one;
 title 'original series';
 plot y*x;

proc expand data=one out=three method=spline ;
 convert y=total/observed=(beginning,total) transformout=(sum);
 id x;
run;
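
For the trapezoid-rule version mentioned in the description, one plausible approach is to repeat the same step with METHOD=JOIN, which fits straight line segments between successive points. The sketch below assumes that this is how the trapezoid result is obtained; the output data set name `four` is hypothetical.

```sas
/* METHOD=JOIN connects successive points with straight line segments, */
/* so integrating that piecewise-linear curve corresponds to the       */
/* trapezoid rule.                                                     */
proc expand data=one out=four method=join;
  convert y=total / observed=(beginning,total) transformout=(sum);
  id x;
run;

/* TRANSFORMOUT=(SUM) makes TOTAL a running sum over the intervals. */
proc print data=four;
  var x total;
  title 'Cumulative area, trapezoid rule (sketch)';
run;
```

Because TOTAL is cumulative, the values in the final rows of the output give the approximate area over the whole range of x; the same reading applies to the spline output in data set three.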
