Thursday, 25 November 2010
Number crunching
For completeness: it is well known that Fortran compilers produce more efficient code than C compilers, and Fortran is used for the core algorithms at most supercomputer sites. Our test cases showed the Fortran version to be 5 times faster than the C#/C++ solution that we have now. If we compile everything with Intel C++ we get a 30% speed-up on 32 bit and a 40% speed-up on 64 bit.
Although we reserve the right to write code in the most efficient language available, there is a lot of value in keeping to standard, well-known languages. We therefore considered 3 options:
1. Everything in C++
2. Everything in C#
3. The Event loop in C# and the location loop in C++
The idea of options 1 and 2 is to have everything in the same domain, which makes it easier to optimize both the event and the location loops. This is important when we want to look into per-risk policies: instead of having the event loop as the outer loop, it makes more sense to have the exposure location loop as the outer loop, because we are then summing on a per-risk basis rather than on an event basis. The exposure table is very big and it is expensive to make copies of this data, so it is better to swap the inner and outer loops.
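To make the point concrete, here is a rough sketch of the two loop orderings (Event, Location and LossOf are hypothetical placeholders, not our actual library types):
public class Event { public int Id; }
public class Location { public int Id; public double InsuredValue; }
public static class LoopOrderSketch
{
    // Model-specific ground-up loss for one event at one location (stub for the sketch).
    static double LossOf(Event e, Location l) { return 0.0; }
    // Per-event aggregation: event loop outside, exposure location loop inside.
    public static double[] PerEvent(Event[] events, Location[] exposure)
    {
        var losses = new double[events.Length];
        for (int e = 0; e < events.Length; e++)
            for (int l = 0; l < exposure.Length; l++)
                losses[e] += LossOf(events[e], exposure[l]);
        return losses;
    }
    // Per-risk aggregation: exposure location loop outside, event loop inside,
    // so the large exposure table is traversed in place and never copied.
    public static double[] PerRisk(Event[] events, Location[] exposure)
    {
        var losses = new double[exposure.Length];
        for (int l = 0; l < exposure.Length; l++)
            for (int e = 0; e < events.Length; e++)
                losses[l] += LossOf(events[e], exposure[l]);
        return losses;
    }
}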
The decision was for option 3 for the following reasons:
- Preserves existing investment
- There will always be a need for a mathematical library; having this in unmanaged code is optimal from a compute point of view and allows portability (e.g. supercomputers)
- Per-risk policies would be handled by adding new functionality to the existing library
- The readability of the code will be improved by writing some more specific functions within the mathematical library
The discussions showed that by choosing this option we do give up the following things:
- A homogeneous language where there is complete freedom over how to optimize and arrange code.
- There is a need to maintain know-how in 2 languages, and there is an associated barrier for scientists to understand, debug and further develop the core mathematical libraries
- IT will not be able to support C++. (This is not such a big issue because deep domain knowledge is required, which would make such collaboration difficult in the first place)
With the choice of language made, we can look more seriously into the structure of base classes where code could be shared between the different peril models. The base class would consist of a template method pattern that would execute methods for:
- Loading exposure
- Disaggregation
- Running the Event Loop
- Calculating Ground up loss
- Calculating loss sigma
- Applying policy structures
The idea is that the models would override the model-specific operations.
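As a rough illustration only (PerilModelBase and the exact method signatures are my own naming, not the agreed design), the template method could look like this:
public abstract class PerilModelBase
{
    // Template method: fixed execution order, model-specific steps overridden by each peril model.
    public void Run()
    {
        LoadExposure();
        Disaggregate();
        RunEventLoop();
        CalculateGroundUpLoss();
        CalculateLossSigma();
        ApplyPolicyStructures();
    }
    protected virtual void LoadExposure() { /* shared, based on the common exposure format */ }
    protected virtual void Disaggregate() { /* shared */ }
    protected abstract void RunEventLoop();           // model specific
    protected abstract void CalculateGroundUpLoss();  // model specific
    protected virtual void CalculateLossSigma() { }
    protected virtual void ApplyPolicyStructures() { }
}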
For the loading of exposure to work we need to use one exposure format.
Disaggregation, the conversion of locators into a distribution of lat/long point values according to distributions such as population, industry exposure, commercial and residential distribution etc., seems to be a fairly independent operation. But the models make choices such as vulnerability, soil type etc. based on geographic location, and it is much easier to derive this information from a locator than to determine in which location a given lat/long point lies. Although building such a database sounds like a simple task, there are implications for how we understand the model output: a thorough understanding of disaggregation is needed in order to understand the model results and to make an underwriting decision that makes sense.
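As an illustration only (the types and the locator-to-points table below are hypothetical, not the model code), disaggregation essentially spreads one locator's exposure value over weighted lat/long points:
using System.Collections.Generic;
public class WeightedPoint { public double Latitude; public double Longitude; public double Weight; }
public class ExposurePoint { public double Latitude; public double Longitude; public double InsuredValue; }
public static class DisaggregationSketch
{
    // Spread a locator's total insured value over weighted lat/long points taken from a
    // distribution table (population, industry exposure, commercial/residential mix, ...).
    public static IEnumerable<ExposurePoint> Disaggregate(
        string locator, double totalInsuredValue,
        IDictionary<string, IList<WeightedPoint>> distribution)
    {
        foreach (WeightedPoint p in distribution[locator])
        {
            yield return new ExposurePoint
            {
                Latitude = p.Latitude,
                Longitude = p.Longitude,
                InsuredValue = totalInsuredValue * p.Weight // weights for a locator sum to 1
            };
        }
    }
}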
Looking at how we want to host the models on a server farm
We are looking into service providers that would allow us to rent virtual machines on a seasonal basis. It is not clear whether we will find a provider that is willing to do this at an appropriate price.
Our computational needs are not the same as those of some reinsurance companies that rely completely on vendor models. Since the license fee for HPC with RMS is significant, we don't have an immediate advantage in using HPC for RMS. The existing scheduler for the common platform is working well, so there is no pressure from this side.
There is value in choosing a scheduler that has an industry following and that will be commonly used in our industry, and there could be value in the ability to scale out on VM Roles in the cloud. From a pure number-crunching point of view, however, the first priority should be to make the number-crunching modules as efficient as possible by optimizing algorithms, the choice of compiler etc.
We had a discussion about distributed computing. Algorithms such as finite element analysis need to pass an entire interface between iterations and therefore cause a lot of MPI messages to be passed; in that case it is really important that the compute nodes are very well connected. In cat modeling we pass only some totals between iterations, so the MPI messages are very small and it may not be so essential that the compute nodes are very well connected. In the case of windstorm models the algorithm is disk I/O bound because the bottleneck is reading storm footprint files, in which case it would make sense to partition jobs based on storm footprints.
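A sketch of that partitioning idea (the folder layout and the lossFromFootprint delegate are assumptions, not the actual model code): each job owns one storm footprint file and only the per-storm total crosses the partition boundary.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
public static class FootprintPartitionSketch
{
    public static IDictionary<string, double> RunByFootprint(
        string footprintFolder, Func<string, double> lossFromFootprint)
    {
        var totals = new ConcurrentDictionary<string, double>();
        Parallel.ForEach(Directory.GetFiles(footprintFolder, "*.dat"), footprintFile =>
        {
            // lossFromFootprint reads one footprint and runs the location loop (model specific).
            double total = lossFromFootprint(footprintFile);
            totals[Path.GetFileName(footprintFile)] = total; // only the total is passed back
        });
        return totals;
    }
}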
My first steps in Azure Cloud Computing
1. Development Environment
-> Now works on PDC laptop
2. Storage Model (see the blob example after this list)
- Blobs
- Tables
- Cache
3. How to initialize data in the cloud
4. How to extract data from the cloud
5. Simple UI
6. Data Access layer
7. Worker Roles
8. Cache
9. Test Driven Development
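To get a feel for the storage model part of the list, here is a minimal blob round trip against development storage using the SDK 1.2 StorageClient library (the container and blob names are arbitrary):
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;
public class BlobSmokeTest
{
    public static string WriteAndReadBack()
    {
        CloudStorageAccount account = CloudStorageAccount.DevelopmentStorageAccount;
        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("testcontainer");
        container.CreateIfNotExist();                        // create the container on first use
        CloudBlob blob = container.GetBlobReference("hello.txt");
        blob.UploadText("Hello world from blob storage");    // write
        return blob.DownloadText();                          // read it back
    }
}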
Setting up the development environment has some hurdles. Firstly, it does not seem to work on Windows XP and it requires IIS7. Here's a list of steps:
1. Open VS2010
2. Select the Cloud Project
3. Click on Download Windows Azure Tools
Windows Azure Tools for Microsoft Visual Studio 1.2 (June 2010)
4. Click Download
5. Run
6. Run
This time it worked well. When you run an Azure application in debug mode you need to start the AppFabric service on your PC. To do this there is a component installed in the Azure SDK section of the start menu.
I had no problems in starting the Development AppFabric but could not start the development storage.
To get around this I ran c:\Program Files\Windows Azure SDK\v1.2\bin\devstore\dsservice.exe which produced an error:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that the SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
I started the SQL Server service and tried again, and this time I was able to run my hello world Azure application.
The next thing I wanted to do was to see how easy it would be to deploy this to Azure. To do this I did the following steps:
1. Right mouse click the cloud project and select Publish
2. Create a certificate and install it on the PC
3. Export the certificate to a file
4. Import the certificate into Azure
5. Create storage and service accounts
6. This takes a long time
Then I got the warning:
WARNING: This deployment has at least one role with only one instance. We recommend that you deploy at least two instances per role to ensure high availability in case one of the instances becomes unavailable. Doing so also enables coverage of the Windows Azure Compute SLA, which guarantees 99.95% uptime. For more information please visit this link.
To correct this:
1. Go to WebRoleTable under Roles in the TableCloadService subproject
2. Increase the instance count from 1 to 2
This got rid of the error message but the package could not be deployed to staging.
Some hello world experiments later I got a message from Microsoft:
Dear Nigel Findlater!
This e-mail notification is intended to inform you about your usage of the Windows Azure Platform. According to our records, your subscription has exceeded 125% of the server usage time included in the offer for the current billing period. All hours beyond those included in the subscription are charged at standard rates.
The usage figures to date for the current billing period are listed below.
: 42.000000
If this value is not unexpected, no action is required on your part. If the usage is unexpected, sign in to the Windows Azure Dev Portal to review the services you currently have deployed and make the necessary changes to bring your usage back to a nominal value.
You can sign in to the Microsoft Online Services customer portal at https://mocp.microsoftonline.com at any time to view your subscription's usage. Click here for detailed instructions on how to read and interpret the usage information on your invoices.
Do not reply to this e-mail. This mailbox is not monitored. If you need customer support, contact a Customer Partner Care representative by clicking on the following
8<--
Name | Resource | Consumed | Included | Billable | Rate | Amount
Windows Azure Compute | Compute hours | 43.000000 | 25.000000 | 18.000000 | 0.132000 | CHF 2.38
Windows Azure Storage | Storage transactions (in 10,000s) | 0.027800 | 1.000000 | 0.000000 | 0.011000 | CHF 0.00
This translates to about 2.50 CHF, which won't break the bank. I am surprised that a couple of Hello World applications deployed to staging could run up more than 50 hours of compute time. To be sure that the time did not build up any more I deleted the application and services. Next I am going to take a much closer look at http://msdn.microsoft.com/en-us/windowsazure/ff796218.aspx and order a book.
Thursday, 11 November 2010
Notes from PDC2010
Entity Framework
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;
public class User
{
    public User()
    {
        Chirps = new List<Chirp>();
    }
    public int Id { get; set; }
    public string Name { get; set; }
    // public ICollection<Chirp> Chirps { get; set; }
    public virtual ICollection<Chirp> Chirps { get; set; } // virtual enables lazy loading
    public int ChirpActivity
    {
        get
        {
            // Chirps posted within the last day
            return (from c in Chirps
                    where c.When > DateTime.Now.AddDays(-1)
                    select c).Count();
        }
    }
}
public class Chirp
{
    public int Id { get; set; }
    [StringLength(255, MinimumLength = 1)]
    public string Message { get; set; }
    public DateTime When { get; set; }
    public virtual User User { get; set; }
}
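These POCOs are then exposed through a context; a minimal sketch assuming the Code First DbContext API shown at PDC (ChirpContext is my own name for it):
using System.Data.Entity;
public class ChirpContext : DbContext
{
    public DbSet<User> Users { get; set; }
    public DbSet<Chirp> Chirps { get; set; }
}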
Design blog: http://blogs.msdn.com/efdesign/
Microsoft announced Visual Studio Team Foundation Server on Windows Azure
https://datamarket.azure.com
Wednesday, 3 November 2010
An example of pair programming in making a deployment utility
When I wrote this I was sitting in an airplane on my way to the PDC and I could not sleep, so it was time to write on my blog. Last week in Paris I had a great experience doing pair programming. Here is what we did.
We started off from some Use cases that described things that we want to automate when we make our deployment of the modeling platform. For example:
- Asynchronous update of storm footprint files
- Synchronous build of server
- Synchronous deleting of databases
- Etc
Based on these use cases we set to work on the sequence diagrams of what we want the deployment tool to do. We came to the notion of a task that would have inputs, results and an execute method. Then we came to the notion of an asynchronous task that would inherit from the task and would have the additional property of a list of targets. We then started to think about the components that would make up the system. Keeping simplicity in mind, or in other words treating simplicity as a feature, we decided that the deployment mechanism should not be distributed unless we find no alternative. So we came up with a TaskDispatcher that would process a series of tasks and a scheduler that could do higher-level operations like scheduling tasks to run.
So with this groundwork we set about building the interfaces that would be needed as inputs to the tasks, and we came up with:
namespace PartnerRe.Deployment.Contracts
{
public interface IParameters
{
string Description { get; set; }
string SourcePath { get; set; }
string DestinationPath { get; set; }
bool FailOnFirstError { get; set; }
TimeSpan TaskTimeOut { get; set; }
}
}
We compared this interface with the list of use cases and found that, with some interpretation by the custom task implementations, this would be sufficient for the inputs required by the tasks.
Next we looked at the outputs and came up with:
namespace PartnerRe.Deployment.Contracts
{
public interface IExecutionResults
{
bool Success { get; set; }
string Message { get; set; }
string LogFilePath { get; set; }
}
}
Again we considered each use case in turn and came to the conclusion that this would be enough.
Next we thought about how to set up the task structures. We decided to use abstract classes for the tasks; in this way we have a class structure that is easier to extend. Starting with the synchronous TaskBase we came up with:
namespace PartnerRe.Deployment.Contracts
{
public abstract class TaskBase : PartnerRe.Deployment.Contracts.ITaskBase
{
public TaskEnumeration Task = TaskEnumeration.None;
public TaskType TaskType = TaskType.Synchronous;
public IParameters Parameters;
public IExecutionResults ExecutionResults;
public abstract void TaskOperation();
public void Execute()
{
StartTime = DateTime.Now;
TaskOperation();
EndTime = DateTime.Now;
}
public DateTime StartTime { get; private set; }
public DateTime EndTime { get; private set; }
}
}
The implementation of the ITaskBase interface is a little overkill. We decided we need something to describe whether the task is synchronous or asynchronous, as well as an enumeration to identify what the task is. The Execute function implements a template method pattern where the programmer must implement the TaskOperation method.
The asynchronous task looks like:
namespace PartnerRe.Deployment.Contracts
{
public abstract class AsynchronousTaskBase : TaskBase
{
public IList<string> Targets;
public string CurrentTarget { get; set; }
public AsynchronousTaskBase()
: base()
{
this.TaskType = Contracts.TaskType.Asynchrounous;
}
}
}
Notice we have the additional list of targets and the CurrentTarget property. We thought for a while about whether we would rather implement a list of tasks. We decided that the above was simpler because all we are interested in is a list of results, not a list of tasks that also includes the input parameters and execution methods. Our idea is to set up one task with placeholders for the list of targets.
Next we wanted to implement the task dispatcher, and we came up with the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.Threading.Tasks;
using System.Threading;
namespace PartnerRe.Deployment
{
public class TaskDispatcher
{
private List<object> taskDaisyChain;
private List<string> targets;
public TaskDispatcher()
{
taskDaisyChain = new List<object>();
}
public void AddTask<T>(T task)
{
this.taskDaisyChain.Add(task);
}
public void ExecuteTasks()
{
for (int i = 0; i < taskDaisyChain.Count; i++)
{
TaskBase currentTask = this.taskDaisyChain[i] as TaskBase;
//object task = this.TaskDaisyChain[i];
if (currentTask.TaskType == TaskType.Synchronous)
{
currentTask.Execute();
}
if (currentTask.TaskType == TaskType.Asynchrounous)
{
if (targets == null)
{
throw new ArgumentException("Asynchronous tasks needs a list of targets to run on");
}
if (targets.Count == 0)
{
throw new ArgumentException("Asynchronous tasks needs a list of targets to run on");
}
AsynchronousTaskBase taskAsynchronous = this.taskDaisyChain[i] as AsynchronousTaskBase;
taskAsynchronous.Targets = targets;
IExecutionResults[] executionResults = new IExecutionResults[targets.Count];
Parallel.For(0, targets.Count, x =>
{
TaskFactory taskFactory = new TaskFactory();
AsynchronousTaskBase taskParallel = taskFactory.CreateTask(taskAsynchronous.Task) as AsynchronousTaskBase;
taskParallel.Parameters = taskAsynchronous.Parameters;
taskParallel.CurrentTarget = taskAsynchronous.Targets[x];
taskParallel.Targets = taskAsynchronous.Targets;
taskParallel.Execute();
lock (executionResults)
{
executionResults[x] = taskParallel.ExecutionResults;
}
}
);
taskAsynchronous.ExecutionResults.Message = "";
taskAsynchronous.ExecutionResults.Success = true;
for (int j = 0; j < targets.Count; j++)
{
taskAsynchronous.ExecutionResults.Message += executionResults[j].Message;
if (!executionResults[j].Success)
{
taskAsynchronous.ExecutionResults.Success = false;
}
}
}
}
}
public void AddListOfTargets(List<string> Targets)
{
this.targets = Targets;
}
}
}
We decided that we would need a queue of tasks, implemented above as a daisy chain. There is some scope for refactoring, for example we have implemented another list of targets here, but the design principles are sound. Also, the lock in the parallel for is probably overkill, but it is a shared variable and in this case the time to execute the tasks is far greater than the time lost in locking the results. The syntax around the parallel for was a little unfamiliar; we did not find many examples of how to implement an iterator.
We implemented a one way file synchronization task that looks like:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.IO;
using System.Diagnostics;
using System.ComponentModel;
using System.ComponentModel.Composition;
namespace PartnerRe.Deployment
{
[Export(typeof(TaskBase))]
public class SynchronizeDirectoryTask : AsynchronousTaskBase
{
private string WorkingDirectory = "C:\\Program Files\\Windows Resource Kits\\Tools";
public SynchronizeDirectoryTask() : base()
{
this.Task = TaskEnumeration.SynchronizeDirectory;
this.Parameters = new Parameters();
this.ExecutionResults = new ExecutionResults();
}
public override void TaskOperation()
{
if (this.Task == TaskEnumeration.None)
{
throw new Exception("The programmer forgot to set the Task enumeration in the Task constructor");
}
//… Lots more tests with ArgumentExceptions
//C:\Program Files\Windows Resource Kits\Tools\Robocopy.exe
// robocopy Source Destination *.* /XO
this.Parameters.DestinationPath = this.Parameters.DestinationPath.Replace("<SERVER>", this.CurrentTarget);
ProcessStartInfo pCopy = new ProcessStartInfo();
pCopy.WorkingDirectory = WorkingDirectory;
pCopy.FileName = "Robocopy.exe";
pCopy.UseShellExecute = false;
pCopy.RedirectStandardOutput = true;
pCopy.RedirectStandardError = true;
pCopy.Arguments = this.Parameters.SourcePath + " "+this.Parameters.DestinationPath+" *.* /XO";
Process proc = Process.Start(pCopy);
string output = proc.StandardOutput.ReadToEnd();
proc.WaitForExit();
string error = proc.StandardError.ReadToEnd();
proc.WaitForExit();
this.ExecutionResults.Message = output;
this.ExecutionResults.LogFilePath = "\\\\" + this.CurrentTarget + "\\c$\\MyLog";
this.ExecutionResults.Success = true;
}
}
}
Here we wanted to program as little as possible, so we took Robocopy and redirected the standard output. We also implemented a MEF Export. The corresponding object factory looks like:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.ComponentModel;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
namespace PartnerRe.Deployment
{
public class TaskFactory
{
public TaskFactory()
{
DirectoryCatalog directoryCatalog = new DirectoryCatalog(@".");
CompositionBatch batch = new CompositionBatch();
batch.AddPart(this);
CompositionContainer container = new CompositionContainer(directoryCatalog);
container.Compose(batch);
}
[ImportMany(typeof(TaskBase))]
private IEnumerable<TaskBase> Tasks;
public TaskBase CreateTask(TaskEnumeration TaskType)
{
foreach(TaskBase t in Tasks)
{
if (t.Task == TaskType)
{
switch (t.TaskType)
{
case Contracts.TaskType.Asynchrounous:
{
return t as AsynchronousTaskBase;
}
case Contracts.TaskType.Synchronous:
{
return t as TaskBase;
}
}
}
}
throw new Exception("This task has not yet been implemented in the TaskFactory");
}
}
}
The idea is that when we add new tasks we want them to be imported automatically into our application with as little manual programming as possible. In this case we only need to add to the TaskEnumeration each time we add a new task.
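For reference, the supporting enumerations used above are plain value lists along these lines (a sketch; apart from None, SynchronizeDirectory and the TaskType members the names are assumptions, and the Asynchrounous spelling is kept as it appears in the task classes):
namespace PartnerRe.Deployment.Contracts
{
    public enum TaskType
    {
        Synchronous,
        Asynchrounous
    }
    public enum TaskEnumeration
    {
        None,
        SynchronizeDirectory
        // one new member per new task, e.g. BuildServer, DeleteDatabases, ...
    }
}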
We developed using Test Driven Development; this was used as a way to design the API. In the end our test looked like:
[TestMethod]
public void SynchronizeDirectory_UnsynchronizedFiles_TargetFilesSyncronized()
{
// Arrange
DestinationDir = "\\\\<SERVER>\\D$\\SOURCE\\PartnerRe.Deployment\\TestFiles\\Source";
TaskDispatcher taskDispatcher = new TaskDispatcher();
SynchronizeDirectoryTask synchronizeDirectoryTask = new SynchronizeDirectoryTask();
synchronizeDirectoryTask.Parameters.DestinationPath = DestinationDir;
synchronizeDirectoryTask.Parameters.SourcePath = SourceDir;
taskDispatcher.AddTask<TaskBase>(synchronizeDirectoryTask);
List<string> Targets = new List<string>();
Targets.Add("CHZUPRELR886W9X");
//Targets.Add("Server2");
taskDispatcher.AddListOfTargets(Targets);
// Act
taskDispatcher.ExecuteTasks();
//// Assert
Assert.IsTrue(synchronizeDirectoryTask.ExecutionResults.Success);
}
As can be seen the test is not complete, but the method is sound. In this case my experience with pair programming showed it is an efficient method of programming. It has the advantage that it integrates at least 2 people directly into the decision-making process of building and refactoring an architecture while sharing knowledge. The only time I found pair programming not to be appropriate is when the programming method is unclear, for example when intensive googling is needed to resolve a small detail.