Thursday, 25 November 2010

Number crunching

The benchmark tests that were carried out last year to include code generated with VS2010 and .Net 4.0. There was a significant improvement in speed (more than 10 times) which makes a model that has been completely written in C# about 0.7 times as fast as code written in unmanaged C++. We are looking into Intel C++ compiler options to determine if we are missing something. But the outcome is that it is no longer a clear cut decision between managed C# an unmanaged C++. The 30% drop in performance could be compensated with the business value of having all code in one domain.

For completeness: It is well know that Fortran compilers are more efficient than C and are used for the core algorithms in most super computer sites. Our test cases showed that this is 5 times faster than C#/C++ solution that we have now. If we compile everything in Intel C++ we get a 30% speed up on 32 bit and a 40% speed up on 64 bit.

Although we reserve the right to write code in the most efficient language available there is a lot of value in keeping to standard well know languages. We therefore considered 3 options:

1. Everything in C++
2. Everything in C#
3. The Event loop in C# and the location loop in C++

The idea of options 1 and 2 is to have everything in the same domain. This makes it easier to optimize both the event and the location loops. This is important for the case when we want to look into per risk policies. This is because instead of having an event loop as the outer loop it makes more sense to have the exposure location loop as the outer loop since instead of summing on an event basis we are summing on a per risk basis. The point is that the exposure table is very big and it is expensive to make copies of this data, hence it is better to change the inner and outer loop.

The decision was for option 3 for the following reasons:
- Preserves existing investment
- There will always be a need for a mathematical library, having this in unmanaged code is the optimal way from a compute point of view and allows portability (eg super computers)
- The Per Risk Policies would be handled by adding new functionality to the existing library
- The readability of the code will be improved by writing some more specific functions within the mathematical library

The discussions showed that by choosing this option we do give up the following things:
- A homogenous language where there is complete freedom over how to optimize and arrange code.
- There is a need to maintain know-how in 2 languages and there is an associated barrier for scientists to understand, debug and further develop core mathematical libraries
- IT will not be able to support C++. (This is not such a big issue because a deep domain knowledge is required which would make such collaboration difficult in the first place)

With the choice of language made we can look more seriously into the structure of base classes where code could be shared between the different peril models. The base class would consist of a template method pattern that would execute methods for:

- Loading exposure
- Disaggregation
- Running the Event Loop
- Calculating Ground up loss
- Calculating loss sigma
- Applying policy structures

The idea is that the models would override the model specific operations.

For the loading of exposure to work we need to use one exposure format.

Disaggregation: the conversion of locators into a distribution of lat long point values according to distributions like population, industry exposure, Commercial and residential distribution etc seems to be an operation that is quite independent. But the models make choices such as vulnerability, soil type etc based on geographic location. It is much easier to derive this information from a locator than to determine in which location a lat long is located. Although the building of such a database sounds a simple task there are implications on how we understand the model output. Basically it is important to have a thorough understanding of disaggregation in order to understand the model results and to make an underwriting decision that makes sense.

Looking at how we want to host the models on a server farm

We are looking into service providers that would allow us to rent virtual machines on a seasonal basis. It is not clear whether we will find a provider that is willing to do this at an appropriate price.

Our computational needs are not the same as some of the Reinsurance companies that rely completely on Vendor models. Since the license fee for the HPC with RMS is significant we don’t have an immediate advantage of using HPC for RMS. The existing Scheduler for the common platform is working well therefore there is no pressure from this side.

There is value in using a scheduler that has an industry following. There is also value in choosing a scheduler that will be commonly used in our industry. There could be value in the ability to scale out on VM Roles in the cloud. From a pure number crunching point of view the first priority should be to make the number crunching modules as efficient as possible by optimizing algorithms and choice of compiler etc.

We had a discussion over distributed computing. Algorithms such as finite element analysis need to pass an entire interface between iterations and will therefore cause a lot of messages to be passed with MPI. In this case it is really important that the compute nodes very well connected. In the case of cat modeling we are passing some totals between iterations and therefore MPI messages are very small. Therefore it might not be so essential that the compute nodes are very well connected. In the case of Windstorm models algoritm speed is disk io bound because the bottle neck is on reading storm footprint files, in which case it would make sense to partition jobs based on storm footprints.