Wednesday 3 November 2010

An example of paired programming in making a deployment utility

I am writing this on an airplane on my way to the PDC, and I cannot sleep, so it is time to write on my blog. Last week in Paris I had a great experience doing paired programming. Here is what we did.

We started from some use cases describing the things we want to automate when we deploy the modeling platform. For example:

  • Asynchronous update of storm footprint files
  • Synchronous build of server
  • Synchronous deleting of databases
  • Etc

Based on these use cases, we set to work on sequence diagrams of what we want the deployment tool to do. We arrived at the notion of a task that has inputs, results and an execute method, and then at the notion of an asynchronous task that inherits from the task and has the additional property of a list of targets. We then started to think about the components that would make up the system. Keeping simplicity in mind, or in other words treating simplicity as a feature, we decided that the deployment mechanism should not be distributed unless we find no alternative. So we came up with a TaskDispatcher that processes a series of tasks and a scheduler that can do higher-level operations such as scheduling tasks to run.

With this groundwork in place, we set about building the interfaces needed as input to the tasks, and we came up with:

using System; // for TimeSpan

namespace PartnerRe.Deployment.Contracts
{
    public interface IParameters
    {
        string Description { get; set; }
        string SourcePath { get; set; }
        string DestinationPath { get; set; }
        bool FailOnFirstError { get; set; }
        TimeSpan TaskTimeOut { get; set; }
    }
}

We compared this interface with the list of use cases and found that, with some interpretation by the custom task implementations, it would be sufficient for the inputs required by the tasks.

Next we looked at the outputs and came up with:

namespace PartnerRe.Deployment.Contracts
{
    public interface IExecutionResults
    {
        bool Success { get; set; }
        string Message { get; set; }
        string LogFilePath { get; set; }
    }
}

Again we considered each use case in turn and came to the conclusion that this would be enough.

Next we thought about how to set up the task structures. We decided to use abstract classes for the tasks, which gives us a class structure that is easier to extend. Starting with the synchronous TaskBase, we came up with:

using System; // for DateTime

namespace PartnerRe.Deployment.Contracts
{
    public abstract class TaskBase : PartnerRe.Deployment.Contracts.ITaskBase
    {
        public TaskEnumeration Task = TaskEnumeration.None;
        public TaskType TaskType = TaskType.Synchronous;
        public IParameters Parameters;
        public IExecutionResults ExecutionResults;
        public abstract void TaskOperation();
        public void Execute()
        {
            StartTime = DateTime.Now;
            TaskOperation();
            EndTime = DateTime.Now;
        }
        public DateTime StartTime { get; private set; }
        public DateTime EndTime { get; private set; }
    }
}

Implementing the ITaskBase interface is a little overkill. We decided we needed something to describe whether the task is synchronous or asynchronous, as well as an enumeration to identify what the task is. The Execute function implements the template method pattern: it records the start and end times around the call, and the programmer must implement the TaskOperation method.
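
To make the template method idea concrete, here is a minimal standalone sketch. The names TimedTaskBase and EchoTask are hypothetical and are not part of our code base, and the try/finally is an addition that the TaskBase above does not have.

```csharp
using System;

// Hypothetical, trimmed-down stand-in for TaskBase: Execute() is the template
// method and TaskOperation() is the step each concrete task must supply.
public abstract class TimedTaskBase
{
    public DateTime StartTime { get; private set; }
    public DateTime EndTime { get; private set; }

    protected abstract void TaskOperation();

    public void Execute()
    {
        StartTime = DateTime.Now;
        try
        {
            TaskOperation();
        }
        finally
        {
            // Record the end time even if the operation throws.
            EndTime = DateTime.Now;
        }
    }
}

// Hypothetical concrete task: only the operation itself is written here.
public class EchoTask : TimedTaskBase
{
    public string Output { get; private set; }

    protected override void TaskOperation()
    {
        Output = "echo";
    }
}
```

A caller only ever calls Execute(), for example `new EchoTask().Execute()`, and the timing bookkeeping comes for free with every new task.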

The asynchronous task looks like:

using System.Collections.Generic; // for IList<T>

namespace PartnerRe.Deployment.Contracts
{
    public abstract class AsynchronousTaskBase : TaskBase
    {
        public IList<string> Targets;
        public string CurrentTarget { get; set; }
        public AsynchronousTaskBase()
            : base()
        {
            this.TaskType = Contracts.TaskType.Asynchrounous;
        } 
    }
}

Notice the additional Targets list and CurrentTarget property. We thought for a while about whether we would rather implement a list of tasks. We decided that the above was simpler, because all we are interested in is a list of results, not a list of tasks that also include the input parameters and execution methods. Our idea is to set up one task with placeholders for the list of targets.
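
The placeholder idea can be sketched as follows. This is only an illustration: the TargetPlaceholder class and its Expand method are hypothetical names, and the only thing taken from our code is the "<SERVER>" token.

```csharp
using System;
using System.Collections.Generic;

public static class TargetPlaceholder
{
    // Placeholder token that stands for the target machine name.
    public const string Token = "<SERVER>";

    // Returns one concrete path per target, leaving the template untouched.
    public static IList<string> Expand(string template, IEnumerable<string> targets)
    {
        var paths = new List<string>();
        foreach (string target in targets)
        {
            paths.Add(template.Replace(Token, target));
        }
        return paths;
    }
}
```

For example, `TargetPlaceholder.Expand(@"\\<SERVER>\D$\Deploy", new[] { "Server1", "Server2" })` yields `\\Server1\D$\Deploy` and `\\Server2\D$\Deploy` (the server and share names here are made up). Because the template itself is never modified, it can be reused for every target.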

Next we wanted to implement the task dispatcher, and we came up with the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.Threading.Tasks;
using System.Threading;

namespace PartnerRe.Deployment
{
    public class TaskDispatcher
    {
        // Tasks run in the order they were added.
        private List<object> taskDaisyChain;
        // Machines that every asynchronous task is run against.
        private List<string> targets;

        public TaskDispatcher()
        {
            taskDaisyChain = new List<object>();
        }

        public void AddTask<T>(T task)
        {
            this.taskDaisyChain.Add(task);
        }
        public void ExecuteTasks()
        {
            for (int i = 0; i < taskDaisyChain.Count; i++)
            {
                TaskBase currentTask = this.taskDaisyChain[i] as TaskBase;

                //object task = this.TaskDaisyChain[i];
                if (currentTask.TaskType == TaskType.Synchronous)
                {
                    currentTask.Execute();
                }
                if (currentTask.TaskType == TaskType.Asynchrounous)
                {
                    if (targets == null || targets.Count == 0)
                    {
                        throw new ArgumentException("Asynchronous tasks need a list of targets to run on");
                    }

                    AsynchronousTaskBase taskAsynchronous = this.taskDaisyChain[i] as AsynchronousTaskBase;
                    taskAsynchronous.Targets = targets;

                    IExecutionResults[] executionResults = new IExecutionResults[targets.Count];

                    Parallel.For(0, targets.Count, x =>
                    {
                        TaskFactory taskFactory = new TaskFactory();
                        AsynchronousTaskBase taskParallel = taskFactory.CreateTask(taskAsynchronous.Task) as AsynchronousTaskBase;
                        taskParallel.Parameters = taskAsynchronous.Parameters;
                        taskParallel.CurrentTarget = taskAsynchronous.Targets[x];
                        taskParallel.Targets = taskAsynchronous.Targets;
                        taskParallel.Execute();
                        lock (executionResults)
                        {
                            executionResults[x] = taskParallel.ExecutionResults;
                        } 
                    }
                    );

                    taskAsynchronous.ExecutionResults.Message = "";
                    taskAsynchronous.ExecutionResults.Success = true;
                    for (int j = 0; j < targets.Count; j++)
                    {
                        taskAsynchronous.ExecutionResults.Message += executionResults[j].Message;
                        if (!executionResults[j].Success)
                        {
                            taskAsynchronous.ExecutionResults.Success = false;
                        }
                    } 
                }
            } 
        }
        public void AddListOfTargets(List<string> Targets)
        {
            this.targets = Targets;
        } 
    }
}

We decided that we would need a queue of tasks, implemented above as a daisy chain. There is some scope for refactoring; for example, we have implemented another list of targets here. But the design principles are sound. Also, the lock in the Parallel.For is probably overkill, since each iteration writes only to its own element of the results array, but it is a shared variable and in this case the time spent executing the tasks is far greater than the time lost in locking the results. The syntax of the Parallel.For was a little unfamiliar, and we did not find many examples of how to implement the iterator.
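
Since examples were hard to find, here is a minimal standalone sketch of the Parallel.For pattern used above: run one piece of work per target and collect the results by index. ParallelCollect and Run are hypothetical names, and the sketch leaves out the lock on the assumption that each iteration touches only its own array slot.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelCollect
{
    // Runs work(target) for each target in parallel and returns the results
    // in the same order as the targets.
    public static TResult[] Run<TResult>(string[] targets, Func<string, TResult> work)
    {
        var results = new TResult[targets.Length];
        Parallel.For(0, targets.Length, i =>
        {
            // Each iteration writes only to its own index, so no lock is needed.
            results[i] = work(targets[i]);
        });
        // Parallel.For returns only after all iterations have finished.
        return results;
    }
}
```

For example, `ParallelCollect.Run(new[] { "a", "b" }, t => t.ToUpper())` returns an array containing "A" and "B" in that order, regardless of which iteration finished first.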

We implemented a one-way file synchronization task that looks like:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.IO;
using System.Diagnostics;
using System.ComponentModel;
using System.ComponentModel.Composition;

namespace PartnerRe.Deployment
{
    [Export(typeof(TaskBase))]
    public class SynchronizeDirectoryTask : AsynchronousTaskBase
    {

        private string WorkingDirectory = "C:\\Program Files\\Windows Resource Kits\\Tools";

        public SynchronizeDirectoryTask() : base()
        {
            this.Task = TaskEnumeration.SynchronizeDirectory;
            this.Parameters = new Parameters();
            this.ExecutionResults = new ExecutionResults();
        }

        public override void TaskOperation()
        {
            if (this.Task == TaskEnumeration.None)
            {
                throw new Exception("The programmer forgot to set the Task enumeration in the Task constructor");
            }
            // … Lots more tests with ArgumentExceptions

            //C:\Program Files\Windows Resource Kits\Tools\Robocopy.exe
            // robocopy Source Destination *.* /XO

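            // Expand the placeholder into a local variable: Parameters is shared
            // between the per-target task instances, so it must not be modified here.
            string destinationPath = this.Parameters.DestinationPath.Replace("<SERVER>", this.CurrentTarget);

            ProcessStartInfo pCopy = new ProcessStartInfo();
            pCopy.WorkingDirectory = WorkingDirectory;
            // With UseShellExecute = false the working directory is not searched
            // for the executable, so give the full path.
            pCopy.FileName = Path.Combine(WorkingDirectory, "Robocopy.exe");
            pCopy.UseShellExecute = false;
            pCopy.RedirectStandardOutput = true;
            pCopy.RedirectStandardError = true;
            pCopy.Arguments = this.Parameters.SourcePath + " " + destinationPath + " *.* /XO";

            StringBuilder error = new StringBuilder();
            using (Process proc = Process.Start(pCopy))
            {
                // Read stderr asynchronously; reading both redirected streams
                // synchronously, one after the other, can deadlock.
                proc.ErrorDataReceived += (sender, e) => { if (e.Data != null) error.AppendLine(e.Data); };
                proc.BeginErrorReadLine();

                string output = proc.StandardOutput.ReadToEnd();
                proc.WaitForExit();

                this.ExecutionResults.Message = output + error;
                this.ExecutionResults.LogFilePath = "\\\\" + this.CurrentTarget + "\\c$\\MyLog";
                // Robocopy exit codes of 8 or higher mean at least one failure.
                this.ExecutionResults.Success = proc.ExitCode < 8;
            }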

        }
    }
}
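
Robocopy reports its outcome through the process exit code, which a task like this one could use to decide whether it succeeded. The following is a small sketch of how the code could be interpreted; the RobocopyExitCode class and the Outcome enum are hypothetical names, and the bit values are the ones documented for Robocopy.

```csharp
using System;

public static class RobocopyExitCode
{
    // Bit flags that Robocopy combines into its exit code.
    [Flags]
    public enum Outcome
    {
        NoChange = 0,      // nothing copied, nothing failed
        FilesCopied = 1,   // one or more files were copied
        ExtraFiles = 2,    // extra files or directories were detected
        Mismatches = 4,    // mismatched files or directories were detected
        CopyFailures = 8,  // some files or directories could not be copied
        FatalError = 16    // serious error, nothing was copied
    }

    // Any value of 8 or higher means at least one failure occurred.
    public static bool IsSuccess(int exitCode)
    {
        return exitCode >= 0 && exitCode < 8;
    }

    // True when the given bit is set in the exit code.
    public static bool Has(int exitCode, Outcome flag)
    {
        return (exitCode & (int)flag) != 0;
    }
}
```

With this, an exit code of 3 counts as a success in which files were copied and extra files were found, while 9 means that some files were copied but others failed.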

Here we wanted to program as little as possible, so we took Robocopy and redirected the standard output. We also implemented a MEF Export. The corresponding object factory looks like:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using PartnerRe.Deployment.Contracts;
using System.ComponentModel;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;

namespace PartnerRe.Deployment
{
    public class TaskFactory
    {
        public TaskFactory()
        {
            DirectoryCatalog directoryCatalog = new DirectoryCatalog(@".");
            CompositionBatch batch = new CompositionBatch();
            batch.AddPart(this);
            CompositionContainer container = new CompositionContainer(directoryCatalog);
            container.Compose(batch);
        }

        [ImportMany(typeof(TaskBase))]
        private IEnumerable<TaskBase> Tasks;

        public TaskBase CreateTask(TaskEnumeration TaskType)
        {
            foreach(TaskBase t in Tasks)
            {
                if (t.Task == TaskType)
                {
                    switch (t.TaskType)
                    {
                        case Contracts.TaskType.Asynchrounous:
                            {
                                return t as AsynchronousTaskBase;
                            }
                        case Contracts.TaskType.Synchronous:
                            {
                                return t as TaskBase;
                            }
                    }
                }
            }
            throw new Exception("This task has not yet been implemented in the TaskFactory");
        } 
    }
}

The idea is that when we add new tasks, we want them to be imported automatically into our application with as little manual programming as possible. In this case we only need to add an entry to the TaskEnumeration each time we add a new task.
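
As a self-contained illustration of this kind of discovery, here is a minimal MEF sketch. PluginTask, CopyTask, DeleteTask and PluginRegistry are hypothetical stand-ins for our classes; the sketch scans the executing assembly with an AssemblyCatalog rather than a directory, and it assumes the project references System.ComponentModel.Composition.

```csharp
using System.Collections.Generic;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for TaskBase.
public abstract class PluginTask
{
    public abstract string Name { get; }
}

// Adding a new task means adding one more exported class like these two.
[Export(typeof(PluginTask))]
public class CopyTask : PluginTask
{
    public override string Name { get { return "Copy"; } }
}

[Export(typeof(PluginTask))]
public class DeleteTask : PluginTask
{
    public override string Name { get { return "Delete"; } }
}

public class PluginRegistry
{
    // MEF fills this with one instance of every exported PluginTask it finds.
    [ImportMany(typeof(PluginTask))]
    public IEnumerable<PluginTask> Tasks { get; set; }

    public PluginRegistry()
    {
        // Scan this assembly for exports and satisfy the import above.
        var catalog = new AssemblyCatalog(Assembly.GetExecutingAssembly());
        var container = new CompositionContainer(catalog);
        container.ComposeParts(this);
    }

    // Returns the task with the given name, or null if none was exported.
    public PluginTask Find(string name)
    {
        return Tasks.FirstOrDefault(t => t.Name == name);
    }
}
```

Here `new PluginRegistry().Find("Copy")` returns the CopyTask instance without the registry ever naming that class, which is the property we were after.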

We developed using Test Driven Development, which also served as a way to design the API. In the end our test looked like:

[TestMethod]
public void SynchronizeDirectory_UnsynchronizedFiles_TargetFilesSyncronized()
{
    // Arrange

    DestinationDir = "\\\\<SERVER>\\D$\\SOURCE\\PartnerRe.Deployment\\TestFiles\\Source";

    TaskDispatcher taskDispatcher = new TaskDispatcher();
    SynchronizeDirectoryTask synchronizeDirectoryTask = new SynchronizeDirectoryTask();
    synchronizeDirectoryTask.Parameters.DestinationPath = DestinationDir;
    synchronizeDirectoryTask.Parameters.SourcePath = SourceDir;

    taskDispatcher.AddTask<TaskBase>(synchronizeDirectoryTask);
    List<string> Targets = new List<string>();

    Targets.Add("CHZUPRELR886W9X");
    //Targets.Add("Server2");
    taskDispatcher.AddListOfTargets(Targets);

    // Act
    taskDispatcher.ExecuteTasks();

    // Assert
    Assert.IsTrue(synchronizeDirectoryTask.ExecutionResults.Success);

}

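As can be seen, the test is not complete (it only asserts the Success flag and does not check that the target files were actually synchronized), but the method is sound.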

In this case, my experience with paired programming showed that it is an efficient method of programming. It has the advantage of bringing at least two people directly into the decision-making process of building and refactoring an architecture while sharing knowledge. The only time I found paired programming inappropriate was when the programming approach was unclear, for example when intensive googling was needed to resolve a small detail.