Wednesday 20 July 2011

Some notes on the application of Cloud computing

I am just about to go on holiday, so before I forget, here are some ideas about the application of Cloud Computing to Catastrophe Modeling.

Cloud computing makes sense for Cat simulations for various reasons. The costs are very reasonable: all my tests came to about 82 CHF, compared to the fixed cost of a VM in a datacenter, which mounts up to several thousand francs. Microsoft can achieve this by automating the administration of its hundreds of thousands of VMs as far as possible, getting an economy of scale that we don't have in smaller data centers. Creating the cloud VMs took a few minutes, as opposed to weeks via our internal/external processes. I could imagine that VMware will improve the provisioning of virtual machines and that one day this may be as simple as filling out a web form. But we don't have such a system yet, and if we did we would have a much more limited pool of servers and therefore higher costs than if the whole thing were in the cloud. One last aspect: surprisingly, the end-to-end time to process the benchmark cat model was shorter in the cloud than on our on-premises servers.

There are different ways in which number crunching processes can be implemented. I have seen solutions that call an executable; if this executable needs something to be installed on the machine, it is possible to configure setup tasks that install software as the VM is being instantiated. In the context of my modeling platform I think I would implement the job submission in the same way as in the prototype I described earlier. By this I mean I would transfer the data to be simulated as a blob and, once the transfer has completed, add a reference to this data to a Queue that the worker roles poll for work. The difference would be that instead of polling the results queue I would expose an on-premises WCF service over http and call this from the cloud using claims-based tokens. I would use Queues for status information because they have an intrinsic order, but I would send this information via one-way calls to the on-premises web services. Since security is based on claims, the role of the firewall changes slightly. The reason is that cloud apps need to connect via http or https to internal services, and the services, not the firewall, will carry out authentication. So there is a shift of responsibility from the firewall to the services.
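
As a rough illustration of the job flow described above, here is a minimal sketch in Python. It is not the prototype itself and it does not use the Azure SDK: the dictionary `blob_store`, the `work_queue`, the function names and the `"demo-token"` check are all hypothetical in-memory stand-ins for blob storage, a cloud queue, the on-premises service and a claims-based token. The "simulation" is just an average of some made-up loss figures.

```python
import queue
import uuid

# Hypothetical in-memory stand-ins for cloud blob storage and a cloud queue.
blob_store = {}
work_queue = queue.Queue()

# Hypothetical stand-in for the on-premises service. In the design described
# above this would be a one-way HTTP(S) call carrying a claims-based token.
status_log = []


def notify_on_premises(job_id, status, token):
    """The service itself, not the firewall, checks the caller's token."""
    if token != "demo-token":  # assumption: a real service validates claims
        raise PermissionError("service rejected the caller's token")
    status_log.append((job_id, status))


def submit_job(data):
    """Upload the input data first, then enqueue a reference to it."""
    blob_name = f"input-{uuid.uuid4()}"
    blob_store[blob_name] = data   # 1. transfer the data as a blob
    work_queue.put(blob_name)      # 2. only then add the reference to the queue
    return blob_name


def worker_poll_once(token):
    """One polling iteration of a worker role. Returns True if a job was run."""
    try:
        blob_name = work_queue.get_nowait()
    except queue.Empty:
        return False
    notify_on_premises(blob_name, "started", token)
    losses = blob_store[blob_name]
    # Placeholder for the real simulation: average the input values.
    blob_store[f"result-{blob_name}"] = sum(losses) / len(losses)
    notify_on_premises(blob_name, "completed", token)
    return True


job = submit_job([100.0, 200.0, 300.0])
while worker_poll_once("demo-token"):
    pass
print(status_log)
print(blob_store[f"result-{job}"])
```

The point of the sketch is the ordering: the reference only goes onto the queue after the blob is fully written, and the status messages are pushed to the service instead of being polled from a results queue.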

From my experience of uploading custom VMs into the cloud I am not sure how well HPC with burst into Azure works in practice. I will follow up on this at the Build conference.

Thinking about how Cloud computing can be integrated into an organization, there are 3 aspects to consider: Network, Storage and Compute.

A means of synchronizing data between office and mobile devices brings substantial benefits. Some years ago I tried a CTP of Microsoft Mesh, which enabled the synchronization of data between devices. Apple recently released a similar cloud service. There was a demo at the PDC which showed a businessman losing his PC, but because all his configuration and working data were continuously in sync it was possible to take a new PC and carry on where he left off. Since the CTP I have seen a version of Mesh in Hotmail and Office 365. Although it is difficult to come up with a single business case where this is useful, it does add to productivity because it improves the functionality of the environment that we work in.

On the network side, there are datacenters around the world, making a global presence much easier to maintain. On the other hand, there is latency in getting data to and from the cloud. This makes the SaaS Team Foundation Server offering very interesting.

I think cloud computing will bring about a shift from relational databases to a more object-oriented data storage model. Relational databases are not intrinsically more performant. At present I believe properly dimensioned on-premises SQL Servers outperform the cloud SQL servers. This will probably change as SQL Azure develops and as technologies like DryadLINQ become available, enabling the distribution of the workload needed to carry out database queries. The reason is that a central SQL server is a bottleneck, whereas in a cloud the compute needed to carry out queries is by design made for scalability.
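
To make the idea of distributing query work a little more concrete, here is a small sketch in Python. It is only an illustration under my own assumptions and has nothing to do with the actual API of SQL Azure or any Microsoft technology: the `partitions` list is made-up data standing in for rows held on separate nodes, and the helper names are hypothetical. It computes the equivalent of a "sum of loss grouped by region" query by aggregating each partition independently and then merging the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: (region, loss) rows split into partitions, as if each
# partition were stored on a separate node.
partitions = [
    [("EU", 10.0), ("US", 5.0)],
    [("EU", 2.5), ("JP", 7.0)],
    [("US", 1.5), ("JP", 3.0)],
]


def partial_sum(rows):
    """Aggregate a single partition; needs no knowledge of the others."""
    totals = {}
    for region, loss in rows:
        totals[region] = totals.get(region, 0.0) + loss
    return totals


def merge(partials):
    """Combine the per-partition results into the final answer."""
    merged = {}
    for part in partials:
        for region, loss in part.items():
            merged[region] = merged.get(region, 0.0) + loss
    return merged


# Each partition is aggregated by its own worker; only the small partial
# results travel to the place where they are merged.
with ThreadPoolExecutor() as pool:
    totals = merge(pool.map(partial_sum, partitions))
print(totals)
```

With a central server all rows pass through one machine, whereas here adding partitions and workers spreads the same query over more compute.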

With this in mind I think it's worthwhile to encourage the development of web-based apps or apps that can be deployed from the web, because these fit easily into the Cloud. We would need to manage the scope of the broad set of mobile devices that may or may not be supported in our environment. This applies in particular to Rapid Application Development in Microsoft-oriented companies, which would favor Windows mobile devices even though these devices have not been on the market long enough to take a large market share. Developing iPad and iPhone applications means needing an Apple workstation and learning Objective-C and a whole new API.

When looking at data, it is difficult to categorize it into security groups. The reason is that the same data can be either sensitive or non-sensitive depending on its context, that is, on the application using it. It is therefore more practical to make security groups for applications.