Bring Excel 2010 to Speed: Remote UDFs with Excel 2010 and HPC Server 2008 R2

A typical sceanrio in the Financial Services world is the following: A user wants to calculate some heavy stuff. In most of the cases it’s thousands of Monte Carlo scenarios. He uses Excel to read real-time data from feeds like Reuters or Bloomberg and uses a function to calculate portfolio values.
The multi-core features of Excel 2010 might help to scale the distribute workload and reduce the total calculation time. But in many cases it is not enough. Assuming a single simulation takes 1 second to calculate and assuming further you need 1,000 different simulations to calculate a portfolio then the whole calculation won’t finish before 1,000 / cores available seconds.
User Defined Functions are a key technology of Excel extensibility. UDFs allow users to create their own mathematical functions and use them in formulas just like they use Excel’s built-in functions sum, avg etc. With the introduction of Excel 2010 and HPC Server 2008 R2 users are now able to distribute UDF workload accross hundreds of physical servers inside an HPC cluster.
UDF programming is usually done in C++. I’m not a C++ expert. A customer in the banking space pointed me to an open source project called Excel DNA that he uses to implement portfolio evaluation UDFs in .NET code. Please have a look at Govert’s blog for updates regarding his project.
I’ll use the Excel DNA project now to demonstrate the functionality of remote UDFs, Excel 2010 and HPC Server 2008 R2:

Start the C# Coding

The sample application I’m going to develop now contains 2 parts: a math part and the UDF part. The application will allow users to forecast stock prices based on historical prices. Assuming the Microsoft stock price was 23$, 23.45$, 24.12$, 23.23$ etc. in the past days, the application will predict the stock price by using a magic glass bowl :-)
The .NET code will we implemented in a simple class library.

The Math Part

The math part of the application was not written by a Quant. It was written by me – and I am the worst mathematican you can find on earth. I’m creating some code that will put some workload on the CPU now.
The core part of the application is a forecast tool that tries to predict tomorrow’s stock price based on a double array of historic input prices.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MySuperCoolRemoteUDF.MathTools
{
 public class MCForecastTool
 {
  ...
  public MCForecastTool(List<double> Prices)
  {
    ...
    Initialize();
  }

  public double PedictNextPrice()
  {
    double absolutePriceChange = lastPrice - beforeLastPrice;

    double rnd1 = RandomNumberGenerator.NextSystem();
    double rnd2 = RandomNumberGenerator.NextSystem();

    double propUpTomorrow = 1d - IndicatorState.ProbUp;
    double propDownTomorrow = IndicatorState.ProbUp;

    short tomorrowsDirection = (short)0;
    short direction = absolutePriceChange < 0 ? (short)-1 : (short)1;
    ...
    double priceChange = CalculateCurrentAVG();

    double lambda = 1d / priceChange;

    double rnd3 = RandomNumberGenerator.NextExponential(lambda);
    double priceTomorrow = ...;

    return priceTomorrow;
  }

  ...
 }
}

The UDF Part

After implementing the mathematical part it is now time to create a UDF using .NET and Excel-DNA. To enable an integration add the Excel-DNA integration .dll to the current Visual Studio project.
Excel-DNA will publish all public static methods as UDFs in Excel that are tagged as ExcelFunction (line 11). Remote UDFs need to set the IsClusterSafe-attribute to true. The UDF named NextPricePlease will accept two parameters: the number of Monte Carlo runs (iterations) and the historical prices that arrive from cell values as a multi-dimensional object array (HistoricalPrices). That’s pretty it! :-)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MySuperCoolRemoteUDF.MathTools;
using ExcelDna.Integration;
namespace MySuperCoolRemoteUDF
{
 public class MonteCarloRunner
 {
   [ExcelFunction(Description = "Glass bowls the next stock price",
Category = "My UDFs", IsClusterSafe=true)]
   public static double NextPricePlease(double Iterations, object[,] HistoricalPrices)
   {
     List<double> Prices = new List<double>(HistoricalPrices.Length);
     // read the Excel Cells:
     foreach (object HistoricalPriceObj in HistoricalPrices)
     {
        double Price = (double) HistoricalPriceObj;
        Prices.Add(Price);
     }
     int IterationsInt = (int)Iterations;
     List<MCForecastTool> MCToolList = new List<MCForecastTool>(IterationsInt);
     // Create N-many forecast tools:
     for (int i = 0; i < IterationsInt; i++)
        MCToolList.Add(new MCForecastTool(Prices));
     double tempSum = 0d;
     // Predict N-many forecast prices
     foreach (MCForecastTool MCFT in MCToolList)
         tempSum += MCFT.PedictNextPrice();
     // Tomorrow's price is the mean of all predictions:
     double NextPrice = tempSum / (double)IterationsInt;
     return NextPrice;
   }
}
}

Configure Excel-DNA

After compiling the C# code it’s time to configure Excel-DNA in the right way. Since the code was compiled to a *.dll we just need to crate an XML file with a unique name:  MySuperCoolRemoteUDF.dna. Further we need to copy the ExcelDna.xll as MySuperCoolRemoteUDF.dll to the same directory (see picture). A more detailed Excel-DNA how-to is discussed in Govert’s forum.

<DnaLibrary>
 <ExternalLibrary Path="MySuperCoolRemoteUDF.dll" />
</DnaLibrary>

Use the UDF in Excel 2010

Local UDF Run

By default every UDF will be executed locally. A user must explicitely activate remote UDF execution on an HPC Server 2008 R2 cluster. Let’s use a local run first to create a new Excel 2010 workbook. As you can see in the screenshot below the spreadsheet will contain historical prices of 20 stocks. These prices could be real- or near-time prices but in our sample they are randomized. Starting from Now we will use the SuperCoolRemoteUDF to predict tomorrows price of every stock. The day after tomorrow will be predicted based on the historical prices from row B to row L plus the predicted price of row M. Thus we have dependencies between every cell starting from row N to row Q.

The local computation of one cell (like M8) takes approximately 6.8 seconds using 250,000 iterations. To calculate every cell row M, N, O, P and Q in parallel we need at least a 20 core machine. Assuming a typical 4 core workstation the whole matrix will be calculated within 20 / 4 * 5 * 6,8 = 170 seconds.

Remote UDF Run

Using an HPC Server 2008 R2 cluster we can scale the computational power to thousands of cores. If there were no dependencies between the cells we would be able to calculate every cell in parallel within the cluster. But since there are dependencies between the rows M to Q we can only calculate row by row.

To enable a parallel run we just need to allow user-defined XLL functions to run on a cluster. That’s it. The spreadsheet will immediately re-calculate its values once they update on a cluster. This reduces the pure computational time from 170 seconds to 20 / 20 * 5 * 6,8 = 35 seconds. Approximately 3-5 seconds need to be added to create a session and send and receive the message requests.

Advertisement
This entry was posted in Uncategorized. Bookmark the permalink.

4 Responses to Bring Excel 2010 to Speed: Remote UDFs with Excel 2010 and HPC Server 2008 R2

  1. Duncan says:

    Thanks for this. I have set up a cluster to run some Quant code, once I used the correct version of Excel DNA all worked perfectly. Version 59433 (from codeplex) works perfectly.

  2. Rob says:

    When passing a UDF along to a cluster are there any restrictions or drawback associated with code that already has threading in it? Does it matter if my function were something like

    [ExcelFunction(IsClusterSafe=true)]
    public static double test()
    {
    double r1 = 0;
    Thread t1 = new Thread(new ThreadStart(() => r1 = expensiveFunction()));
    t1.Start();

    double r2 = 0;
    Thread t2 = new Thread(new ThreadStart(() => r2 = expensiveFunction()));
    t2.Start();

    t1.Join();
    t2.Join();

    return r1 + r2;
    }

    Or should I leave the workload distribution to the cluster? I suppose I’m curious as to what qualifies as “cluster safe”.

    I’m new to all things cluster related, so pardon my ignorance. Any help would be appreciated.

    Thanks.

    Rob

    • Rob, you should distrubute what makes sense :-)
      ie. I’ve seen that running 12 processes on a 12 core machine with core affinity executes faster than 1 process with 12 threads on the same machine. Clusters are build not bought. This means you need to test what is faster. Run it with both expensiveFunctions in 1 UDF call and run it with both expensiveFunctions – 1 in 1 UDF call each…

      Hope this makes sense. Torsten

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s