Special Offer
Advanced Excel VBA prorgrams in finance & statistics.

- open source code -

Include Monte Carlo Simulation, Multiple Regression, Bootstrap, Random Numbers Generator and many more!


Risk Analyzer
The easiest to use decision and risk analysis program that one can find. Completely menu driven.


Visual Basic Collection
Learn to write Excel macros and get over 1200 macros.









Excel VBA Simulation Basic Tutorial 102


This page is the second part of the Excel VBA Simulation Basic Tutorial series.  It provides Excel VBA tutorials on how to create statistic estimates that are used to analyze the data from a simulation.  Many of the examples used are already available in Excel functions.  Users can use these Excel functions as tools to check against the results that come from the examples.  These examples require basic programming skills in VBA.  Users are encouraged to read Simulation Based Tutorial 101 if they have problem understanding the programming concepts and terms used on this page.

This document contains information about the following topics.




Random Number and Randomize Statement

To generate random number from 0 to 1 uniformly, one can use the Rand() function in Excel or the Rnd function in VBA.  These two functions are the mother of all random numbers.  You will need either one of these functions to generate random numbers from any probability distributions.

The following example generate 5 random numbers and then display them in a message box:

Sub rndNo()
    Dim str As String
   
    For i = 1 To 5
        str = str & CStr(Rnd) & vbCrLf
    Next i
   
    MsgBox str   
End Sub

* CStr() function converts the random numbers into string.





So far so good.  But when we close the file, reopen it, and run the sub routine again, the same 5 numbers come up! 

The reason why this happens is that the random numbers were actually being generated from the same set of numbers (called seed).  By placing the Randomize statement in the sub routine, the numbers will be generated from a new seed.  (Randomize uses the return value from the Timer function as the new seed value.) 

The new routine can be as followed:

Sub rndNo()
    Dim str As String
   
    Randomize
    For i = 1 To 5
        str = str & CStr(Rnd) & vbCrLf
    Next i
   
    MsgBox str   
End Sub

Sometimes we might want to use the same seed over and over again by just changing the values of certain variables in our simulations to see how the change affects the outcomes.  In such case, omit the Randomize statement in your sub routine.


For more information, refer to Excel VBA Help in your Excel program.





Standard Deviation and Mean

Standard deviaiton and mean are the two mostly used statistic estimates of all times.  Mean is the average.  Standard deviation measures the 'spreadness' of the distribution.
 



The following are functions that compute mean and standard deviation.  These functions are similar to other  functions used in our examples; they take array as their arguments.

Function Mean(Arr() As Single)
     Dim Sum As Single
     Dim i As Integer

    Sum = 0
    For i = 1 To UBound(Arr)
        Sum = Sum + Arr(i)
    Next i
 
    Mean = Sum / UBound(Arr)
End Function


Function StdDev(Arr() As Single)
     Dim i As Integer
     Dim avg As Single, SumSq As Single
 
     avg = Mean(Arr)
     For i = 1 To UBound(Arr)
          SumSq = SumSq + (Arr(i) - avg) ^ 2
     Next i
 
     StdDev = Sqr(SumSq / (UBound(Arr) - 1))
End Function

The following sub routine reads the data in column one from row 1 to 10 (of Sheet1) into the array, calls both functions by passing the arguements to them, computes the mean (average) and the standard deviation, then returns the values in a message box.

Sub compute()
     Dim Arr(10) As Single
     Dim Average As Single
     Dim Std_Dev As Single

     For i = 1 To UBound(Arr)
          Arr(i) = Sheets("Sheet1").Cells(i, 1)
     Next i

     Average = Mean(Arr)
     Std_Dev = StdDev(Arr)
     MsgBox "Average:" & vbTab & Average & vbCrLf & "StdDev :" & vbTab & Std_Dev
End Sub

The figures below show the data and the result.

                                 




Similar example is also used in the Standard Deviation and Mean examples on the VBA section.


(These functions are similar to the AVERAGE() and the STDEV() functions provided by Excel.)    




Skewness and Kurtosis

Skewness measures the degree of asymmetry of a distribution.  For example, the skewness of a normal distribution is 0 since a normal distribution is symmetric.  Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values, where as negative skewness extending toward more negative values.




Kurtosis measures the degree of peakedness or flatness of a distribution compared with normal distribution.  Positive kurtosis indicates a relatively peaked distribution.  Negative kurtosis indicates a relatively flat distribution. 





Often, these two estimates along with mean and standard deviation are used to test to see if the simulated data from a distribution is sound (if the data represents the distribution).

The following sub routine, compute(), reads the following data in column one from row 1 to 10 (of the active sheet) into the array,



calls both functions by passing the arguements, computes the four moments (namely mean, standard deviation, skewness, and kurt) and returns the values in a message box.

Sub compute()
    Dim arr(10) As Single
   
    For i = 1 To 10
        arr(i) = Cells(i, 1)
    Next i
    
    MsgBox "Mean:" & vbTab & Format(Mean(arr), "0.0000") & vbCrLf & _
    "SD:" & vbTab & Format(Var(arr) ^ 0.5, "0.0000") & vbCrLf & _
    "Skew:" & vbTab & Format(Skew(arr), "0.0000") & vbCrLf & _
    "Kurt:" & vbTab & Format(Kurtosis(arr), "0.0000")   
End Sub

Function Skew(arr() As Single)
    Dim i As Long, n As Long
    Dim avg As Single, sd As Single, SumTo3 As Single
   
    n = UBound(arr)
    avg = Mean(arr)
    sd = (Var(arr)) ^ 0.5
   
    SumTo3 = 0
    For i = 1 To n
        SumTo3 = SumTo3 + ((arr(i) - avg) / sd) ^ 3
    Next i
   
    Skew = SumTo3 * (n / ((n - 1) * (n - 2))) 
End Function

Function Kurtosis(arr() As Single)
    Dim i As Long, n As Long
    Dim avg As Single, sd As Single, SumTo3 As Single
   
    n = UBound(arr)
    avg = Mean(arr)
    sd = (Var(arr)) ^ 0.5
   
    SumTo4 = 0
    For i = 1 To n
        SumTo4 = SumTo4 + ((arr(i) - avg) / sd) ^ 4
    Next i
   
    Kurtosis = SumTo4 * (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) - (3 * (n - 1) ^ 2 / ((n - 2) * (n - 3)))
End Function

Function Mean(arr() As Single)
    Dim Sum As Single
    Dim i As Long, k As Long
   
    k = UBound(arr)
    Sum = 0
    For i = 1 To k
        Sum = Sum + arr(i)
    Next i
   
    Mean = Sum / k   
End Function

Function Var(arr() As Single)
    Dim i As Long
    Dim avg As Single, SumSq As Single
   
    k = UBound(arr)
    avg = Mean(arr)
    For i = 1 To k
        SumSq = SumSq + (arr(i) - avg) ^ 2
    Next i
   
    Var = SumSq / (k - 1)   
End Function


The figures below show the data and the result.


                         



(These functions are similar to the SKEW() and the KURT() functions provided by Excel.)  





Percentile and Confidence Interval

Percentile returns the k-th percentile of values in a range.  A confidence interval is the interval between two percentiles.  For example: if a set of data has 20 numbers ranging from 2.5 to 50 with an increment of 2.5 (2.5, 5, ...., 50), the 80th percentile would be 40.  This means that 80% of the elements from the set will be equal to or below than 40.  If the alpha value is 10%, for a two tails test, the lower percentile should be set to 5% (alpha/2) and the upper percentile should be set to 95% (1 - alpha/2).

In order to get the percentile, the data needs to be sorted.  In the sub routine (GetPercentile()) below, 10 random numbers between 1 to 50 are assigned to an array.  The sub routine calls the percertile function (u_percentile()).  The function calls the Sort sub routine to sort the array.  The function gets the value from the array based on the percentile (40%), and returns the percentile value back to the sub routine. 

Notice that Application.Max(Application.Min(Int(k * n), n), 1) in the percentile function makes sure that first, the array index is an integer and second, the maximum value and the minimum value for the array index will not excess the number of elements in the data set or below 1, respectively.

The data and the result are as followed:




The numbers in blue are below the 40% percentile.  Nineteen (19), in this case, is the value that the function returns at 40% percentile.

Here is the complete program for the above example:

Sub GetPercentile()
    Dim arr(10) As Single
   
    For i = 1 To 10
        arr(i) = Int(Rnd * 50) + 1
        Cells(i, 1) = arr(i)
    Next i

    Cells(10, 2) = u_percentile(arr, 0.4)
End Sub

Function u_percentile(arr() As Single, k As Single)
    Dim i As Integer, n As Integer
   
    n = UBound(arr)   
    Call Sort(arr)
    x = Application.Max(Application.Min(Int(k * n), n), 1)
    u_percentile = arr(x)   
End Function

Sub Sort(ByRef arr() As Single)
    Dim Temp As Single
    Dim i As Long
    Dim j As Long
      
    For j = 2 To UBound(arr)
        Temp = arr(j)
        For i = j - 1 To 1 Step -1
            If (arr(i) <= Temp) Then GoTo 10
            arr(i + 1) = arr(i)
        Next i
        i = 0
10      arr(i + 1) = Temp

        If j Mod 100 = 0 Then
            Cells(26, 5) = j
        End If
    Next j
End Sub



Similar concept from this tutorial is used in many of our simulation examples.


(This function is similar to the PERCENTILE() and the QUARTILE() functions provided by Excel.)





Profitablity

The previous percentile example shows how to get the value that corresponds to a specific percentile. In this example, we will show you on how to get the percentile with a given value.

We are going to start this tutorial by showing you a very simple simulation.  However, simulation is not necessary to get the answer in this example because we are using very loss assumptions.  The result can actually be computed in your head if your math is that good.

Assume your profit is distributed uniformly.  From the past records, you know that your annual average profit flucturates between -$100,000 to $500,000.  We want to know what is the probabilty that you will be making over $300,000 next year holding all other things constant.  Interesting enough?  Now watch this:

            1 - (300,000-(-100,000))/(500,000-(-100,000)) = 1 - 0.666 = 0.333

The probabilty will be 33%.

Now, let's run the simulation and see what will happen.

Five simulations were ran, each with 1000 iterations.  The result shows 5 probability values in a message box.  Each result is closed to the mathematic computation of 33%.





Here is the sub routine that runs the simulation:

Sub GetProb()
    Dim high As Single, low As Single, profit As Single
    Dim counter As Integer
    Dim str As String
   
    high = 500000
    low = -100000
    profit = 300000
   
    srt = ""
    For j = 1 To 5
        counter = 0
        For i = 1 To 1000
            If profit <= Rnd * (high - low + 1) + low Then
                counter = counter + 1
            End If
        Next i
        str = str & counter / 1000 & vbCrLf
    Next j
   
    MsgBox str
End Sub


This example is also implemented in the Monte Carlo Simulation tutorial.


(This function is similar to the PERCENTRANK()
function provided by Excel.)  





Creating a Histogram

A histogram from a simulation shows the graphical representation of the derived probability distribution. 

The following sub procedure is an improved model for generating a histogram.  The first parameter, M, is the number of bins (breaks) that you want to have for the histogrm.  The second parameter is the array that contains that values for the histogram.

In order for this procedure to work properly, the array needs to be sorted for calling the histogram procedure.  This way, the maximum and the minimum values can be derived and used for setting up the bin values.  Please see the following examples for the implementation:

Normal Distribution Random Number Generator, Bootstrap - A Non-Parametric Approach,
and Monte Carlo Simulation.

Here are the codes that generate a histogram:


Sub Hist(M As Long, arr() As Single)
    Dim i As Long, j As Long
    Dim Length As Single
    ReDim breaks(M) As Single
    ReDim freq(M) As Single
   
    For i = 1 To M
        freq(i) = 0
    Next i

    Length = (arr(UBound(arr)) - arr(1)) / M
   
    For i = 1 To M
        breaks(i) = arr(1) + Length * i
    Next i
   
    For i = 1 To UBound(arr)
        If (arr(i) <= breaks(1)) Then freq(1) = freq(1) + 1
        If (arr(i) >= breaks(M - 1)) Then freq(M) = freq(M) + 1
        For j = 2 To M - 1
            If (arr(i) > breaks(j - 1) And arr(i) <= breaks(j)) Then freq(j) = freq(j) + 1
        Next j
    Next i
   
    For i = 1 To M
        Cells(i, 1) = breaks(i)
        Cells(i, 2) = freq(i)
    Next i
End Sub

The following is an example output from the procedure:



The class is the bins or the breaks.  The frequency contains the number of simulated values for each of the classes.

Here is the histogram chart from this example: