Thursday 24 December 2009

First steps in making a genric DAL with Entity Framework 4.0

I found an interesting blog about how to do this in http://elegantcode.com/2009/12/15/entity-framework-ef4-generic-repository-and-unit-of-work-prototype/

The first step is to add the Interface:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Linq.Expressions;
using System.Data.Objects;
using System.Data;

namespace DAL
{
    public interface IRepository2<T> : IDisposable where T : class
    {
            IQueryable<T> Fetch();
            IEnumerable<T> GetAll();
            IEnumerable<T> Find(Func<T, bool> predicate);
            T Single(Func<T, bool> predicate);
            T First(Func<T, bool> predicate);
            void Add(T entity);
            void Delete(T entity);
            void Attach(T entity);
            void SaveChanges();
            void SaveChanges(SaveOptions options);
    }
}

Next implementing the interface:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Linq.Expressions;
using System.Data.Objects;
using System.Data;

//http://elegantcode.com/2009/12/15/entity-framework-ef4-generic-repository-and-unit-of-work-prototype/
//
namespace DAL
{
/// <summary>
/// A generic repository for working with data in the database
/// </summary>
/// <typeparam name="T">A POCO that represents an Entity Framework entity</typeparam>

    public class DataRepository<T> : IRepository2<T> where T : class
    {

        /// <summary>
        /// The context object for the database
        /// </summary>
        private ObjectContext _context;

        /// <summary>
        /// The IObjectSet that represents the current entity.
        /// </summary>
        private IObjectSet<T> _objectSet;

        /// <summary>
        /// Initializes a new instance of the DataRepository class
        /// </summary>
        public DataRepository()
            : this(new RESEARCH_CMEntities())
        {

        }

        /// <summary>
        /// Initializes a new instance of the DataRepository class
        /// </summary>
        /// <param name="context">The Entity Framework ObjectContext</param>

        public DataRepository(ObjectContext context)
        {
            _context = context;
            _objectSet = _context.CreateObjectSet<T>();
        }

        /// <summary>
        /// Gets all records as an IQueryable
        /// </summary>
        /// <returns>An IQueryable object containing the results of the query</returns>

        public IQueryable<T> Fetch()
        {
            return _objectSet;
        }

        public IQueryable<T> GetQuery() { return _objectSet; }

        /// <summary>
        /// Gets all records as an IEnumberable
        /// </summary>
        /// <returns>An IEnumberable object containing the results of the query</returns>
        public IEnumerable<T> GetAll()
        {
            return GetQuery().AsEnumerable();
        }

        /// <summary>
        /// Finds a record with the specified criteria
        /// </summary>
        /// <param name="predicate">Criteria to match on</param>
        /// <returns>A collection containing the results of the query</returns>
        public IEnumerable<T> Find(Func<T, bool> predicate)
        {
            return _objectSet.Where<T>(predicate);
        }

        /// <summary>
        /// Gets a single record by the specified criteria (usually the unique identifier)
        /// </summary>
        /// <param name="predicate">Criteria to match on</param>
        /// <returns>A single record that matches the specified criteria</returns>
        public T Single(Func<T, bool> predicate)
        {
            return _objectSet.Single<T>(predicate);
        }

        /// <summary>
        /// The first record matching the specified criteria
        /// </summary>
        /// <param name="predicate">Criteria to match on</param>
        /// <returns>A single record containing the first record matching the specified criteria</returns>
        public T First(Func<T, bool> predicate)
        {
            return _objectSet.First<T>(predicate);
        }

        /// <summary>
        /// Deletes the specified entitiy
        /// </summary>
        /// <param name="entity">Entity to delete</param>
        /// <exception cref="ArgumentNullException"> if <paramref name="entity"/> is null</exception>
        public void Delete(T entity)
        {
            if (entity == null)
            {
                throw new ArgumentNullException("entity");
            }
            _objectSet.DeleteObject(entity);
        }

        /// <summary>
        /// Deletes records matching the specified criteria
        /// </summary>
        /// <param name="predicate">Criteria to match on</param>
        public void Delete(Func<T, bool> predicate)
        {
            IEnumerable<T> records = from x in _objectSet.Where<T>(predicate) select x;

            foreach (T record in records)
            {
                _objectSet.DeleteObject(record);
            }
        }

        /// <summary>
        /// Adds the specified entity
        /// </summary>
        /// <param name="entity">Entity to add</param>
        /// <exception cref="ArgumentNullException"> if <paramref name="entity"/> is null</exception>
        public void Add(T entity)
        {
            if (entity == null)
            {
                throw new ArgumentNullException("entity");
            }
            _objectSet.AddObject(entity);
        }

        /// <summary>
        /// Attaches the specified entity
        /// </summary>
        /// <param name="entity">Entity to attach</param>
        public void Attach(T entity)
        {
            _objectSet.Attach(entity);
        }

        /// <summary>
        /// Saves all context changes
        /// </summary>
        public void SaveChanges()
        {
            _context.SaveChanges();
        }

        /// <summary>
        /// Saves all context changes with the specified SaveOptions
        /// </summary>
        /// <param name="options">Options for saving the context</param>
        public void SaveChanges(SaveOptions options)
        {
            _context.SaveChanges(options);
        }

        /// <summary>
        /// Releases all resources used by the WarrantManagement.DataExtract.Dal.ReportDataBase
        /// </summary>
        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        /// <summary>
        /// Releases all resources used by the WarrantManagement.DataExtract.Dal.ReportDataBase
        /// </summary>
        /// <param name="disposing">A boolean value indicating whether or not to dispose managed resources</param>
        protected virtual void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (_context != null)
                {
                    _context.Dispose();
                    _context = null;
                }
            }
        }
    }
}

I hear the sounds of Champaign glasses, so, Merry Christmas…

Monday 21 December 2009

Setting up a Partitioned database

Here is a script that shows how to setup a partitioned database. This can be useful to increase the performance of database by spreading the data across various drives for parallel i/o. In my case it was interesting to manage the life cycle of data by using a set of files corresponding to times in the year. The idea is to setup partitions for an arbitrarily long period. Then as data is no longer needed the corresponding filegroups can be dropped and the corresponding file archived away. There is a way to do this with rolling filegroups, the problem is that the file name no longer indicate to what period of time it belongs.

-- Usefull links:
-- http://sqlblog.com/blogs/sarah_henwood/archive/2007/07/23/validating-what-is-stored-in-a-partition-filegroup.aspx
-- http://msdn.microsoft.com/en-us/library/ms345146(SQL.90).aspx

USE [master]
GO

CREATE DATABASE [PartinionedDB] ON  PRIMARY
( NAME = N'PrimaryFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartinionedDB.mdf' , SIZE = 6336KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'PrimaryLogFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartinionedDB_log.ldf' , SIZE = 10176KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
-- Add filegroups that will contains partitioned values
ALTER DATABASE PartinionedDB ADD FILEGROUP fgBefore2008;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2008Quarter1;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2008Quarter2;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2008Quarter3;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2008Quarter4;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2009Quarter1;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2009Quarter2;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2009Quarter3;
ALTER DATABASE PartinionedDB ADD FILEGROUP fg2009Quarter4;
ALTER DATABASE PartinionedDB ADD FILEGROUP fgAfter2009;
-- Add files to filegroups
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF08Q1', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF08Q1.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2008Quarter1;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF08Q2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF08Q2.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2008Quarter2;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF08Q3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF08Q3.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2008Quarter3;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF08Q4', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF08Q4.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2008Quarter4;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF09Q1', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF09Q1.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2009Quarter1;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF09Q2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF09Q2.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2009Quarter2;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF09Q3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF09Q3.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2009Quarter3;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fF09Q4', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fF09Q4.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fg2009Quarter4;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fBefore08', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fBefore08.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fgBefore2008;
ALTER DATABASE PartinionedDB ADD FILE (NAME = 'fAfter09', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\fAfter09.ndf', SIZE = 2 MB ,FILEGROWTH = 2 MB ) TO FILEGROUP fgAfter2009;

USE [PartinionedDB]
GO

-- Create Partition Function and Scheme
CREATE PARTITION FUNCTION pf_MyFyQuarters (DATETIME) AS
RANGE RIGHT FOR VALUES
(
'2008-01-01 00:00:00','2008-04-01 00:00:00','2008-07-01 00:00:00','2008-10-01 00:00:00', -- 2008
'2009-01-01 00:00:00','2009-04-01 00:00:00','2009-07-01 00:00:00','2009-10-01 00:00:00'); -- 2009
GO
CREATE PARTITION SCHEME ps_MyFyQuarters
AS PARTITION pf_MyFyQuarters
TO
(fgBefore2008,                                               -- partition 1
fg2008Quarter1,fg2008Quarter2,fg2008Quarter3,fg2008Quarter4, -- partition 2, 3, 4, 5
fg2009Quarter1,fg2009Quarter2,fg2009Quarter3,fg2009Quarter4, -- 6, 7, 8, 9
fgAfter2009);                                          -- partition 10
GO

-- Create a table that uses the partition function
CREATE TABLE PartitionTable
(
    Id UNIQUEIDENTIFIER,
    DateColumn DATETIME NOT NULL,
    ValueA INTEGER,
    ValueB VARCHAR(50)
)
ON ps_MyFyQuarters (DateColumn)
GO

-- Insert Data and check where it is stored
INSERT INTO PartitionTable(Id, DateColumn, ValueA, ValueB) VALUES('5D72B397-A066-43A4-B409-117DED69C41E', '09/30/2004', 10, 'AAA');
INSERT INTO PartitionTable(Id, DateColumn, ValueA, ValueB) VALUES('BA9403BE-A7B2-4209-807E-2F90F04A1B72', '09/30/2009', 20, 'BBB');
INSERT INTO PartitionTable(Id, DateColumn, ValueA, ValueB) VALUES('4A6283CB-4C63-4F6B-8D16-AFB4D36E4036', '09/30/2008', 30, 'CCC');
INSERT INTO PartitionTable(Id, DateColumn, ValueA, ValueB) VALUES('F268C4DA-7A88-49D2-B423-E15622D9BC00', '04/30/2009', 40, 'DDD');

SELECT * FROM PartitionTable
WHERE $PARTITION.pf_MyFyQuarters(DateColumn) = 1
SELECT * FROM PartitionTable
WHERE $PARTITION.pf_MyFyQuarters(DateColumn) = 5
SELECT * FROM PartitionTable
WHERE $PARTITION.pf_MyFyQuarters(DateColumn) = 6
SELECT * FROM PartitionTable
WHERE $PARTITION.pf_MyFyQuarters(DateColumn) = 7
SELECT * FROM PartitionTable
WHERE $PARTITION.pf_MyFyQuarters(DateColumn) = 8

Sunday 20 December 2009

Spilitting databases and maintaining foreign key relationships

Database are becoming less monolithic and sometimes it is convenient to split a large database into smaller ones. The idea is to abstract the data and to ease the deployment of large amounts of data. To make this work you need to be able to in-force foreign key relationships so as to preserve the integrity of the data. Care needs to be taken concerning queries that cross one database to another. The reason is that SQL Server has a sql compiler that learns about the content of the database and uses this to optimize the execution plan of queries. Provided there are no cross cutting queries who’s performance have been effected the following script is a good way of enforcing cross database foreign key relationships.

USE [master]
GO

CREATE DATABASE [Database] ON  PRIMARY
( NAME = N'PrimaryFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\Database.mdf' , SIZE = 6336KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'PrimaryLogFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\Database_log.ldf' , SIZE = 10176KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

CREATE DATABASE [DatabaseCopy] ON  PRIMARY
( NAME = N'PrimaryFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\DatabaseCopy.mdf' , SIZE = 6336KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'PrimaryLogFileName', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\DatabaseCopy_log.ldf' , SIZE = 10176KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

-- CREATE DatabaseLink (Linked Server), only necessary if the databases are on different servers

USE [Database]
GO

CREATE TABLE MainTab
(
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    ValueA INTEGER,
    ValueB VARCHAR(50)
);

USE [DatabaseCopy]
GO

CREATE TABLE MainTab
(
    Id UNIQUEIDENTIFIER PRIMARY KEY,
);

CREATE TABLE ChildTab
(
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    Parent UNIQUEIDENTIFIER REFERENCES MainTab ON DELETE CASCADE,
    ValueA INTEGER
);

-- Create Triggers on the "Master" Database
USE [Database]
GO

CREATE TRIGGER [trg_InsertParentRec]
    ON  MainTab
    AFTER INSERT
AS
BEGIN
    /*Name of linked server shall be added here*/
    INSERT INTO [DatabaseCopy].[dbo].[MainTab](Id) (SELECT Id FROM inserted);
END
GO

CREATE TRIGGER [trg_DeleteParentRec]
    ON  MainTab
    AFTER DELETE
AS
BEGIN
    /*Name of linked server shall be added here*/
    DELETE FROM [DatabaseCopy].[dbo].[MainTab] WHERE Id  IN (SELECT deleted.Id FROM deleted);
END
GO

-- Insert into Master Database
USE [Database]
GO

INSERT INTO MainTab(Id, ValueA, ValueB) VALUES('5D72B397-A066-43A4-B409-117DED69C41E', 10, 'AAA');
INSERT INTO MainTab(Id, ValueA, ValueB) VALUES('BA9403BE-A7B2-4209-807E-2F90F04A1B72', 20, 'BBB');
INSERT INTO MainTab(Id, ValueA, ValueB) VALUES('4A6283CB-4C63-4F6B-8D16-AFB4D36E4036', 30, 'CCC');
INSERT INTO MainTab(Id, ValueA, ValueB) VALUES('F268C4DA-7A88-49D2-B423-E15622D9BC00', 40, 'DDD');

SELECT * FROM MainTab;

-- The Records in the MainTab should be automatically created by the trigger
-- Add some child records
USE [DatabaseCopy]
GO

SELECT * FROM MainTab;

INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), '5D72B397-A066-43A4-B409-117DED69C41E', 100);
INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), '5D72B397-A066-43A4-B409-117DED69C41E', 200);
INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), 'BA9403BE-A7B2-4209-807E-2F90F04A1B72', 300);
INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), 'BA9403BE-A7B2-4209-807E-2F90F04A1B72', 400);
INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), '4A6283CB-4C63-4F6B-8D16-AFB4D36E4036', 500);
INSERT INTO ChildTab(Id, Parent, ValueA) VALUES(NEWID(), 'F268C4DA-7A88-49D2-B423-E15622D9BC00', 600);

SELECT * FROM ChildTab;

-- Delete a record from the Master Database
USE [Database]
GO

DELETE FROM MainTab WHERE Id = '5D72B397-A066-43A4-B409-117DED69C41E';

SELECT * FROM MainTab;

-- Also the Main and the child record should be deleted
USE [DatabaseCopy]
GO

SELECT * FROM MainTab;
SELECT * FROM ChildTab;

Thursday 17 December 2009

A Simple Database Naming Convention

In the early days of databases tooling was not able to interrogate the database. The result was that naming conventions where created that describe the functions of tables, views, fields and stored procedures. Today there are very good tools that allow these properties to be directly interrogated from the data access layer. Unfortunately these cryptic database naming conventions live on in many companies.

It is important that databases are built in a consistent way. Here is a brief database naming convention that allows a consistent design but without un necessary encryption.

1. Database Naming Convention

The following conventions are built from an extract of the naming conventions from Jason Mauss. The original can be found under the following link.

http://weblogs.asp.net/jamauss/articles/DatabaseNamingConventions.aspx

1.1 Tables

When naming your database tables, give consideration to other steps in the development process. Keep in mind you will most likely have to utilize the names you give your tables several times as part of other objects, for example, procedures, triggers or views may all contain references to the table name. You want to keep the name as simple and short as possible. Some systems enforce character limits on object names also. For example, in Oracle you are limited to about 30 characters per object.

Rules:

  • Plural Names - Table names should be plural, for example, "Customers" instead of "Customer". For table names with multiple words, only the last word should be plural, for example, "UserRoles" and "UserRoleSettings".
  • Prefixes - Used correctly, table prefixes can help you organize your tables into related groups or distinguish them from other unrelated tables. Used poorly, they can cause you to have to type a lot of unnecessary characters. We'll discuss what not to do first. Do not give your table names prefixes like "tbl" or "TBL_" as these are just redundant and unnecessary. Note that even for the prefix, use Pascal Case.
  • Notation - For all parts of the table name, including prefixes, use Pascal Case.
  • Special Characters
    • For table names, underscores should not be used. The underscore character has a place in other object names but, not for tables.
    • Do not use numbers in your table names either. This usually points to a poorly designed data model or irregularly partitioned tables.
    • Do not use spaces in your table names either.
    • If you are developing in a non-english language, do not use any of that language's special characters.
  • Abbreviations - Avoid using abbreviations if possible. Use "Accounts" instead of "Accts" and "Hours" instead of "Hrs". Do not use acronyms.
  • Junction a.k.a Intersection Tables - Junction tables, which handle many to many relationships, should be named by concatenating the names of the tables that have a one to many relationship with the junction table. For example, you might have "Doctors" and "Patients" tables. Since doctors can have many patients and patients can have many doctors (specialists) you need a table to hold the data for those relationships in a junction table. This table should be named DoctorsPatients". Since this convention can result in lengthy table names, abbreviations sometimes may be used at your discretion.

Example:

  • Employees

1.2 Columns

(incl. PRIMARY, FOREIGN, AND COMPOSITE KEYS) - When naming your columns, keep in mind that they are members of the table, so they do not need the any mention of the table name in the name. The primary key field is typically the only exception to this rule where including the table name is justified so that you can have a more descriptive field name than just "Id". "CustomerId" is acceptable but not required. Just like with naming tables, avoid using abbreviations, acronyms or special characters. All column names should use Pascal Case to distinguish them from SQL keywords (all upper case).

Rules:

  • Identity Primary Key Fields - For fields that are the primary key for a table and uniquely identify each record in the table, the name should simply be “Id“ since, that's what it is - an identification field. This name also maps more closely to a property name like “Id“ in your class libraries. Another benefit of this name is that for joins you will see something like
    "Customers JOIN Orders ON Customer.Id = Orders.CustomerId“
    which allows you to avoid the word “Customer“ again after the Customer table.
  • Foreign Key Fields - Foreign key fields should have the exact same name as they do in the parent table where the field is the primary key - with one exception - the table name should be specified. For example, in the Customers table the primary key field might be "Id". In an Orders table where the customer id is kept, it would be "CustomerId". There is one exception to this rule, which is when you have more than one foreign key field per table referencing the same primary key field in another table. In this situation, it might be helpful to add a descriptor before the field name. An example of this is if you had an Address table. You might have another table with foreign key fields like HomeAddressId, WorkAddressId, MailingAddressId, or ShippingAddressId.
  • Composite Keys - If you have tables with composite keys (more than one field makes up the unique value) then instead of just “Id“ you should use a descriptor before the “Id“ characters. Two fields like “ModuleId“ and “CodeId“ might make up the composite key, for example. If you don't see an “Id“ column in the table - you'll know that a composite key is used to uniquely identify records.
  • Prefixes - Do not prefix your fields with "fld_" or "Col_" as it should be obvious in SQL statements which items are columns (before or after the FROM clause). Including a two or three character data type prefix for the field is optional and not recommended, for example, "IntCustomerId" for a numeric type or "VcName" for a varchar type. However, these data type abbreviations are DBMS specific and are outside the scope of this document.
  • Data Type Specific Naming - Boolean fields should be given names like "IsDeleted", "HasPermission", or "IsValid" so that the meaning of the data in the field is not ambiguous. If the field holds date and/or time information, the word "Date" or "Time" should appear somewhere in the field name. It is sometimes appropriate to add the unit of time to the field name also, especially if the field holds data like whole numbers ("3" or "20"). Those fields should be named like "RuntimeHours" or "ScheduledMinutes".

1.3 Indexes

Since indexes are always related to a table or view, it makes the most sense to use the name of the table or view, as well as the column(s) they index, in the index name, along with some characters that specify the type of index it is. This naming convention also allows you, if looking at a list of indexes, to see the indexes ordered by table, then column, then index type.

Rules:

  • Naming Convention - The naming convention for indexes follows this structure:
    {TableName}{ColumnsIndexed}{U/N}{C/N}
    where "U/N" is for unique or non-unique and "C/N" is for clustered or non-clustered. This naming convention is unique among database objects, so adding characters to denote it being an index, like "idx" is not necessary. The naming convention alone is self-documenting and indentifies it as an index. For indexes that span multiple columns, concatenate the column names. "ProductsIdUC" indicates a unique, clustered index on the Id column of the Products table. OrderDetailsOrderIdCustomerIdNN" indicates a non-unique, non-clustered index on the OrderId and CustomerId columns in the OrderDetails table. Since this name is rather lengthy with both "OrderId" and "CustomerId" spelled out, they could be shortened to OrdId and CustId. However, notice that by using Pascal Case, thus not needing to use underscores, it is possible to keep the name of a complex index to about 30 characters.
  • Prefixes and Suffixes - Avoid putting a prefix like "idx" or "IDX_" before your indexes. This is not necessary due to the naming convention discussed in Rule 3a. A suffix of "_idx" or "IDX" is not necessary either for the same reason.

1.4 Constraints

Constraints are at the field/column level so the name of the field the constraint is on should be used in the name. The type of constraint (Check, Referential Integrity a.k.a Foreign Key, Primary Key, or Unique) should be noted also. Constraints are also unique to a particular table and field combination, so you should include the table name also to ensure unique constaint names across your set of database tables.

Rules:

  • Naming Convention - The naming convention syntax for constraints looks like this:
    {constraint type}{table name}_{field name}
    Examples:
    1. PkProducts_Id - primary key constraint on the Id field of the Products table
    2. FkOrders_ProductId - foreign key constraint on the ProductId field in the Orders table
    3. CkCustomers_AccountRepId - check constraint on the AccountRepId field in the Customers table
  • The reason underscores are used here with Pascal Case notation is so that the table name and field name are clearly separated. Without the underscore, it would become easy to get confused about where the table name stops and the field name starts.
  • Prefixes - A two letter prefix gets applied to the constraint name depending on the type
    Primary Key: Pk
    Foreign Key: Fk
    Check: Ck
    Unique: Un

1.5 Views

Views follow many of the same rules that apply to naming tables. There are only two differences. If your view combines entities with a join condition or where clause, be sure to combine the names of the entities that are joined in the name of your view.

  • Prefixes - While it is pointless to prefix tables, it can be helpful for views. Prefixing your views with "Vw" or "View" is a helpful reminder that you're dealing with a view, and not a table. Whatever type of prefix you choose to apply, use at least 2 letters and not just "V" because a prefix should use more more than one letter or its meaning can be ambiguous.
  • View Types - Some views are simply tabular representations of one or more tables with a filter applied or because of security procedures (users given permissions on views instead of the underlying table(s) in some cases). Some views are used to generate report data with more specific values in the WHERE clause. Naming your views should be different depending on the type or purpose of the view. For simple views that just join one or more tables with no selection criteria, combine the names of the tables joined. For example, joining the "Customers" and "StatesAndProvinces" table to create a view of Customers and their respective geographical data should be given a name like "VwCustomersStatesAndProvinces". For a view that is more like a report, a name like "VwDivisionSalesFor2004" might make more sense.

1.6 Stored Procedures

Unlike a lot of the other database objects discussed here, stored procedures are not logically tied to any table or column. Typically though, stored procedures perform one of the CRUD (Create, Read, Update, and Delete) operations on a table, or another action of some kind. Since stored procedures always perform some type of operation, it makes sense to use a name that describes the operation they perform. Use a verb to describe the type of operation, followed by the table(s) the operations occur on.

  • Prefixes or Suffixes - The way you name your stored procedures depends on how you want to group them within a listing. If you'd like to group them by the type of CRUD operation they perform, then prefix the name with "Create", "Get", "Update" or "Delete". Using this kind of prefix will, for example, group all of your "Create" procedures together since they will all start with the Create prefix, like "CreateProductInfo" or "CreateOrder". If instead, you would like to have your procedures ordered by the table they perform a CRUD operation on, adding "Create, Get, Update, or Delete" as a suffix will do that for you. For example, "ProductInfoCreate" or "OrdersCreate". If your procedure returns a scalar value, or performs an operation like validation, you should not use a CRUD prefix or suffix. Instead use the verb and noun combination. For example, "ValidateLogin"
  • Grouping Prefixes - If you have many stored procedures, you might want to consider using a grouping prefix that can be used to identify which parts of an application the stored procedure is used by. For example, a "Prl" prefix for Payroll related procedures or a "Hr" prefix for Human Resources related procedures can be helpful. This prefix would come before a CRUD prefix (See Rul 6a).
  • Bad Prefixes - Do not prefix your stored procedures with something that will cause the system to think it is a system procedure. For example, in SQL Server, if you start a procedure with "sp_", "xp_" or "dt_" it will cause SQL Server to check the master database for this procedure first, causing a performance hit. Spend a little time researching if any of the prefixes you are thinking of using are known by the system and avoid using them if they are.
1.6.1 Prefixes used in Project Datenmarkt
  • User defined stored procedure are prefixed by “usp_”.

1.7 Triggers

Triggers have many things in common with stored procedures. However, triggers are different than stored procedures in two important ways. First, triggers don't exist on their own. They are dependant upon a table. So it is wise to include the name of this table in the trigger name. Second, triggers can only execute when either an Insert, Update, or Delete happens on one or more of the records in the table. So it also makes sense to include the type of action that will cause the trigger to execute.

  • Prefixes and Suffixes - To distinguish triggers from other database objects, it is helpful to add "Trg" as a prefix or suffix. For example any of these combinations work: Trg_ProductsIns, ProductsInsTrg, Products_InsTrg, or InsProducts_Trg. As long as you include the table name, the operation that executes the trigger (Ins, Upd, or Del) and the "Trg" letters, it should be obvious to someone working with the database what kind of object it is. As with all conventions you use, pick one and remain consistent.
  • Multiple Operations - If a trigger handles more than one operation (both INSERT and UPDATE for example) then include both operation abbreviations in your name. For example, "Products_InsUpdTrg" or "TrgProducts_UpdDel"
  • Multiple Triggers - Some systems allow multiple triggers per operation per table. In this case, you should make sure the names of these triggers are easy to distinguish between. For example "Users_ValidateEmailAddress_InsTrg" and "Users_MakeActionEntries_InsTrg".

Wednesday 16 December 2009

NHibernate with Linq

We have been looking into replacing our data access layer with a loosely coupled approach. Here is a summary of what the resulting data access layer looks like.

Starting with 2 tables with a foreign key relationship. The first table has some wishes that have WishStatus given in the second class.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace PartnerRe.NCP.Data.Contracts
{
    public class Wish
    {
        public virtual int Id { get; set; }
        public virtual string Title { get; set; }
        public virtual string Description { get; set; }
        public virtual DateTime EntryDate { get; set; }
        public virtual WishStatus Status { get; set; }
    }

//CREATE TABLE [dbo].[DEV_WISH_LIST](
//    [WISH_LIST_ID] [int] IDENTITY(1,1) NOT NULL,
//    [TITLE] [varchar](max) NULL,
//    [DESCRIPTION] [varbinary](max) NULL,
//    [COMMENT] [varbinary](max) NULL,
//    [ENTRY_DATE] [datetime] NULL,
//    [STATUS_ID] [int] NULL,
}

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace PartnerRe.NCP.Data.Contracts
{
    public enum WishStatus
    {
        New,
        Approved,
        InProgress,
        Completed,
        Declined,
        Postponed
    }
}

Here is a mapping file that maps the fields in these classes to database table fields

Wish.hbm.xml
============

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
    assembly="Entities"
    namespace="Entities">

    <class
        name="Wish"
        table="DEV_WISH_LIST"
        lazy="false">

        <id name="Id" column="WISH_LIST_ID">
            <generator class="native"/>
        </id>

        <property name="Title" column="TITLE" />
        <property name="Description" column="DESCRIPTION" />
        <property name="EntryDate" column="ENTRY_DATE" />
        <property name="Status" column="STATUS_ID" />

    </class>
</hibernate-mapping>

The database connection strings are defined in a config file

<?xml version="1.0"?>
<configuration>

  <configSections>
    <section name="hibernate-configuration" type="NHibernate.Cfg.ConfigurationSectionHandler, NHibernate" />
  </configSections>

  <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2" >
    <session-factory>
      <property name="dialect">NHibernate.Dialect.MsSql2005Dialect</property>
      <property name="connection.provider">NHibernate.Connection.DriverConnectionProvider</property>
      <property name="connection.driver_class">NHibernate.Driver.SqlClientDriver</property>
      <property name="connection.connection_string">Data Source=localhost;Initial Catalog=RESEARCH_CM;Integrated Security=True</property>
      <property name="proxyfactory.factory_class">NHibernate.ByteCode.Castle.ProxyFactoryFactory, NHibernate.ByteCode.Castle</property>
      <property name="default_catalog">RESEARCH_CM</property>
      <property name="default_schema">dbo</property>
    </session-factory>
  </hibernate-configuration>

</configuration>

Next we make a Generic Data Access Layer. First we need to generate an interface. This interface enables us to replace NHibernate with any data access technology.

using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace PartnerRe.NCP.Data.Contracts
{
    /// <summary>
    /// Simple, general purpose, reusable data access layer (DAL) interface.
    /// </summary>
    public interface IDatabaseContext
    {
        /// <summary>
        /// Gets all entities of given type.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <returns>The list of entities.</returns>
        IList<TEntity> GetAll<TEntity>();

        /// <summary>
        /// Gets all entities of a given type ordered ascending or descending by the given property.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="orderBy">The property the entities are ordered by.</param>
        /// <param name="ascending">if set to <c>true</c> [ascending].</param>
        /// <returns>The list of ordered entities.</returns>
        IList<TEntity> GetAll<TEntity>(string orderBy, bool ascending);

        /// <summary>
        /// Gets all entities according to the given simple filter.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="propertyName">Name of the property.</param>
        /// <param name="value">The value of the property.</param>
        /// <returns></returns>
        IList<TEntity> GetFiltered<TEntity>(string propertyName, object value);

        /// <summary>
        /// Loads the entity with the given identifier.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="dbId">The db id.</param>
        /// <returns>The loaded entitiy.</returns>
        TEntity GetById<TEntity>(object dbId);

        /// <summary>
        /// HACK: this interface should not be domain specific. Use query/sql string or criteria API, or LINQ !!!
        /// </summary>
        /// <typeparam name="TEntity"></typeparam>
        /// <param name="status"></param>
        /// <returns></returns>
        //IList<TEntity> GetByStatus<TEntity>(WishStatus status);

        //IList<Wish> GetWishByLinq();

        IQueryable<TEntity> GetLinq<TEntity>();

        /// <summary>
        /// Reattaches a detached entity to the current session.
        /// </summary>
        /// <param name="entity">The entity to reattach.</param>
        void Reattach(object entity);

        /// <summary>
        /// Detaches an attached entity from the current session. Removes it from the cache (evict).
        /// </summary>
        /// <param name="entity">The entity to detach.</param>
        void Detach(object entity);

        /// <summary>
        /// Inserts or updates the given entities to the database.
        /// </summary>
        /// <param name="entities">The entities to insert or update.</param>
        void Save(params object[] entities);

        /// <summary>
        /// Deletes the specified entities from the database.
        /// </summary>
        /// <param name="entities">The entities to delete.</param>
        void Delete(params object[] entities);

        /// <summary>
        /// Begins a new transaction.
        /// </summary>
        /// <returns>Returns a general transaction scope. Can be used in "using() { .Commit(); }" semantic. </returns>
        ITransactionScope BeginTransaction();

        /// <summary>
        /// Determines whether there is an active transaction.
        /// </summary>
        /// <returns>
        ///     <c>true</c> if transaction is active; otherwise, <c>false</c>.
        /// </returns>
        bool IsActiveTransaction();

        /// <summary>
        /// Commits the currently open transaction.
        /// </summary>
        void CommitTransaction();

        /// <summary>
        /// Rolls back the currently open transaction.
        /// </summary>
        void RollbackTransaction();
    }
}

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace PartnerRe.NCP.Data.Contracts
{
    /// <summary>
    /// Custom transaction scope to abstract System.TransactionScope, NHibernate TransactionScope, or other into
    /// independant transaction context.
    /// This allows the "using() { .Commit(); }" semantic.
    /// </summary>
    public interface ITransactionScope : IDisposable
    {
        /// <summary>
        /// Commits the transaction.
        /// </summary>
        void Commit();

        /// <summary>
        /// Rolls back the transaction.
        /// </summary>
        void Rollback();
    }
}

The Generic implementation of NHibernate with this interface looks like:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Cfg;
using NHibernate.Criterion;
using NHibernate.Linq;
using System.Reflection;
using System.IO;
using PartnerRe.NCP.Data.Contracts;

namespace PartnerRe.NCP.Data.NHibernate
{
    /// <summary>
    /// NHibernate implementation of data access layer.
    /// This is an "expensive" object due to the containing NHibernate session factory!
    /// </summary>
    public class DatabaseContext : IDatabaseContext
    {
        /// <summary>
        /// The NHibernate SessionFactory. Static.
        /// </summary>
        protected static ISessionFactory SessionFactory { get; set; }

        /// <summary>
        /// NHibenrate configuration instance. Static.
        /// </summary>
        protected static Configuration Configuration { get; set; }

        /// <summary>
        /// Current NHibernate session.
        /// </summary>
        protected ISession Session;

        /// <summary>
        /// Current NHibernate transaction.
        /// </summary>
        protected ITransaction Transaction;

        /// <summary>
        /// Initializes a new instance of the <see cref="DatabaseContext"/> class.
        /// </summary>
        public DatabaseContext() : this(false)
        {
        }

        private void CriteriaApiWithNigel()
        {
            //ICriteria criteria = new EqPropertyExpression(fs.ENTRY_TYPE_COLUMN, xxx);

            //NHibernate.
            //this.Session.CreateCriteria<facebug.entities.Entry>().
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="DatabaseContext"/> class.
        /// </summary>
        /// <param name="createSchema">if set to <c>true</c> creates the database schema (tables).</param>
        public DatabaseContext(bool createSchema)
        {
            if (Configuration == null)
            {
                throw new InvalidOperationException("must initialize expensive stuff (static factories) before creating instances");
            }

            // create session
            this.Session = this.CreateSession();
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="DatabaseContext"/> class.
        /// </summary>
        /// <param name="session">The native NHibernate session.</param>
        public static void Initialize(string configFile, string assemblyLocation)
        {
            // configure NHibernate using given config file
            Configuration = new Configuration();
            Configuration.Configure(configFile);

            // look for NHibernate mappings
            if (!string.IsNullOrEmpty(assemblyLocation))
            {
                foreach (Assembly assembly in AssemblyReflectionHelper
                    .GetAssembliesWithAttribute<HbmContainerAssemblyAttribute>(
                        new DirectoryInfo(assemblyLocation),
                        "*"))
                {
                    Configuration.AddAssembly(assembly);
                }
            }
        }

        /// <summary>
        /// Creates a NHiberntae Session to work with.
        /// </summary>
        /// <returns>An open NHibernate session </returns>
        public ISession CreateSession()
        {
            return CreateSession(null);
        }

        /// <summary>
        /// Creates a NHiberntae Session to work with.
        /// </summary>
        /// <param name="interceptor">Interceptor connected to the session</param>
        /// <returns></returns>
        public ISession CreateSession(IInterceptor interceptor)
        {
            if (interceptor != null)
            {
                return GetSessionFactory().OpenSession(interceptor);
            }

            this.Session = GetSessionFactory().OpenSession();

            // make sure to be in the default catalog to make all the queries working.
            // There is an issue when creating tables:
            // JIRA Issue http://jira.nhibernate.org/browse/NH-1443

            return this.Session;
        }

        /// <summary>
        /// Gets the NHibernate session factory. Expensive object.
        /// </summary>
        /// <returns></returns>
        protected ISessionFactory GetSessionFactory()
        {
            if (SessionFactory == null)
            {
                SessionFactory = Configuration.BuildSessionFactory();
            }
            return SessionFactory;
        }

        #region IDatabaseContext Members

        /// <summary>
        /// Gets all entities of given type.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <returns>The list of entities.</returns>
        public IList<TEntity> GetAll<TEntity>()
        {
            return Session
                .CreateCriteria(typeof(TEntity))
                .List<TEntity>();
        }

        /// <summary>
        /// Gets all entities of a given type ordered ascending or descending by the given property.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="orderBy">The property the entities are ordered by.</param>
        /// <param name="ascending">if set to <c>true</c> [ascending].</param>
        /// <returns>The list of ordered entities.</returns>
        public IList<TEntity> GetAll<TEntity>(string orderBy, bool ascending)
        {
            return Session
                .CreateCriteria(typeof(TEntity))
                .AddOrder(new Order(orderBy, ascending))
                .List<TEntity>();
        }

        /// <summary>
        /// Gets all entities according to the given simple filter.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="propertyName">Name of the property.</param>
        /// <param name="value">The value of the property.</param>
        /// <returns></returns>
        public IList<TEntity> GetFiltered<TEntity>(string propertyName, object value)
        {
            return Session
                .CreateCriteria(typeof(TEntity))
                .Add(Expression.Eq(propertyName, value))
                .List<TEntity>();
        }

        /// <summary>
        /// HACK: this interface should not be domain specific. Use query/sql string or criteria API, or LINQ !!!
        /// </summary>
        /// <typeparam name="TEntity"></typeparam>
        /// <param name="status"></param>
        /// <returns></returns>
        //public IList<TEntity> GetByStatus<TEntity>(WishStatus status)
        //{
        //    return Session
        //        .CreateCriteria(typeof(TEntity))
        //        .Add(Expression.Eq("Status", WishStatus.Completed))
        //        .List<TEntity>();
        //}

        //public IList<Wish> GetWishByLinq()
        //{
        //    return this.Session
        //        .Linq<Wish>()
        //        .Where<Wish>((e) => e.Status == WishStatus.InProgress)
        //        .OrderBy<Wish, string>(w => w.Title)
        //        .ToList();
        //}

        public IQueryable<TEntity> GetLinq<TEntity>()
        {
            return this.Session.Linq<TEntity>();
        }

        /// <summary>
        /// Loads the entity with the given identifier.
        /// </summary>
        /// <typeparam name="TEntity">The type of the entity.</typeparam>
        /// <param name="dbId">The db id.</param>
        /// <returns>The loaded entitiy.</returns>
        public TEntity GetById<TEntity>(object dbId)
        {
            return Session.Get<TEntity>(dbId);
        }

        /// <summary>
        /// Reattaches a detached entity to the current session.
        /// </summary>
        /// <param name="entity">The entity to reattach.</param>
        public void Reattach(object entity)
        {
            Session.Lock(entity, LockMode.None);
        }

        /// <summary>
        /// Detaches an attached entity from the current session. Removes it from the cache (evict).
        /// </summary>
        /// <param name="entity">The entity to detach.</param>
        public void Detach(object entity)
        {
            if (Session.IsDirty())
            {
                throw new InvalidOperationException("Do not detach on dirty session. Commit first.");
            }
            Session.Evict(entity);
        }

        /// <summary>
        /// Inserts or updates the given entities to the database.
        /// </summary>
        /// <param name="entities">The entities to insert or update.</param>
        public void Save(params object[] entities)
        {
            if (entities.Length == 0) throw new ArgumentNullException("entities", "must pass entities to persist");

            foreach (object entity in entities)
            {
                Session.SaveOrUpdate(entity);
            }
        }

        /// <summary>
        /// Deletes the specified entities from the database.
        /// </summary>
        /// <param name="entities">The entities to delete.</param>
        public void Delete(params object[] entities)
        {
            if (entities.Length == 0) throw new ArgumentNullException("entities", "must pass entities to delete");

            foreach (object entity in entities)
            {
                Session.Delete(entity);
            }
        }

        /// <summary>
        /// Begins a new transaction.
        /// </summary>
        /// <returns>
        /// Returns a general transaction scope. Can be used in "using() { .Commit(); }" semantic.
        /// </returns>
        public ITransactionScope BeginTransaction()
        {
            if (this.Transaction != null && this.Transaction.IsActive)
            {
                throw new InvalidOperationException("nested transactions are not supported. Close open transaction before doing this");
            }

            this.Transaction = Session.BeginTransaction();

            return new TransactionScope(this.Transaction);
        }

        /// <summary>
        /// Commits the currently open transaction.
        /// </summary>
        public void CommitTransaction()
        {
            this.Transaction.Commit();
        }

        /// <summary>
        /// Rolls back the currently open transaction.
        /// </summary>
        public void RollbackTransaction()
        {
            this.Transaction.Rollback();
        }

        /// <summary>
        /// Determines whether there is an active transaction.
        /// </summary>
        /// <returns>
        ///     <c>true</c> if transaction is active; otherwise, <c>false</c>.
        /// </returns>
        public bool IsActiveTransaction()
        {
            if (this.Transaction != null && this.Transaction.IsActive)
            {
                return true;
            }

            return false;
        }

        #endregion
    }
}

Here is an example of how an extension may look like:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace PartnerRe.NCP.Data.NHibernate
{
    public static class DatabasePivotExtension
    {
        public static IQueryable<T> Pivot<T>(this IQueryable<T> q)
        {
            return q;
        }
    }
}

To use this in the business logic you first need to create a database session. This session can then be used in Linq queries. Linq builds up an object graph that is loosely coupled from the data access layer that then uses NHibernate as an ORM to persist data to the database. Here’s an example including the Linq Extention

            wishList = this.databaseContext
                .GetLinq<Wish>()
                .Where<Wish>((w) => w.Status == WishStatus.InProgress)
                .OrderBy<Wish, string>(w => w.Title)
                .Pivot<Wish>()
                .ToList();

8<---

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using DAL;
using Entities;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        private IDatabaseContext databaseContext = new DatabaseContext();
        private IList<Wish> wishList;

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            wishList = this.databaseContext.GetAll<Wish>();
            this.dataGridView1.DataSource = wishList;
        }

        private void button1_Click(object sender, EventArgs e)
        {
            int start = Environment.TickCount;

            //wishList = this.databaseContext.GetByStatus<Wish>(WishStatus.InProgress);
            //wishList = this.databaseContext.GetWishByLinq();
            wishList = this.databaseContext
                .GetLinq<Wish>()
                .Where<Wish>((w) => w.Status == WishStatus.InProgress)
                .OrderBy<Wish, string>(w => w.Title)
                .Pivot<Wish>()
                .ToList();

            MessageBox.Show((Environment.TickCount - start).ToString(), "Loading took");
            this.dataGridView1.DataSource = wishList;
        }
    }
}

This is a very powerful method of building up a data access layer. The ORM does have an overhead that needs to be tested. For bulk inserts other methods such as ADO.NET Bulkcopy may be more appropriate.

Tuesday 15 December 2009

Using Partial Classes to expose methods that are client, server or shared

Before the invention of shared classes this code would have been made by inheriting shared functionality from a base class. Here is another way of sharing functionality from a shared class:

This class if for shared methods for both client and server
namespace SharedPart
{
    public class Class1
    {
        public void SharedF1()
        {
        }
    }
}

This is an extension to the shared class for methods that are for clients only
using SharedPart;
namespace ClientPart
{
    public static class Class1Extension
    {
        public static void ClientF1(this Class1 c1)
        {
        }
    }
}

This is an extension to the shared class for methods that are for servers only
using SharedPart;
namespace ServerPart
{
    public static class Class1Extension
    {
        public static void ServerF1(this Class1 c1)
        {
        }
    }
}

ion

Here is how to consume the client and shared parts
using SharedPart;
using ClientPart; // By commenting these out restricts the functionality visible in Class1namespace Client
{
    static class Program
    {
        static void Main()
        {
            Class1 c = new Class1();
            c.SharedF1();
            c.ClientF1();

        }
    }
}

Using partial classes in this way is not always the best approach because it does not lend itself to implementation in interfaces. Often the inherited way is better because the resulting code is more structured. But this is an interesting method to extend functionality of an existing class.

Sunday 13 December 2009

Deprecate Compiler Option Og

We made some interesting discoveries concerning deprecated compiler options.

Between Visual C++ 2003 and Visual C++ 2005 Microsoft made a number of changes. One of which was a range-check to the vector operator[]. This means that every call to operator[] includes a check that the argument is within the range of the vector, which slows things down. But there are ways to remove this check. The second change was that Og optimizer flag was deprecated. This flag made the MS C++ compiled code almost as fast as our Intel C++ code. Fortunately Visual Studio 2010 still supports this flag.

I am in contact with Microsoft to determine what will replace Og.

Saturday 5 December 2009

My first steps in the world of using MPI in Parallel Numeric Algorithms in .Net applications

MPICH2 is a de facto industry standard for parallel numeric algorithms. Although it has been around for about 20 years and has a very strong following in the scientific community the average Software Engineer has not heard of this. This is a shame because although the programming model is a little low level it is very powerful and provides a great way to make use of multi core or even distributed grid computing. There are implementations of MPI for a wide variety of languages ranging from R to Fortran.

I have looked into an implementation of MPI that works with .NET from Open System Labs, Pervasive Technology Labs at Indiana University http://www.osl.iu.edu/research/mpi.net/

Software Development Kit for MPI.NET needs Windows Compute Cluster Server and can run on Windows HPC Server 2008, Windows XP, or Windows Vista. Then there is a command line that invokes an MPI program within the framework of the Compute Cluster Server. Here is an example from the sdk:

C:\Program Files\MPI.NET>"C:\Program Files\Microsoft Compute Cluster Pack\Bin\mp
iexec.exe" -n 8 PingPong.exe
Rank 0 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 1... Pong!
  Rank 1 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 2... Pong!
  Rank 2 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 3... Pong!
  Rank 3 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 4... Pong!
  Rank 4 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 5... Pong!
  Rank 5 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 6... Pong!
  Rank 6 is alive and running on chzuprel227.global.partnerre.net
Pinging process with rank 7... Pong!
  Rank 7 is alive and running on chzuprel227.global.partnerre.net

I carried out a test that proved that all cores of my laptop where being used. There may be a way to call this programatically via C:\Program Files\Microsoft Compute Cluster Pack\Bin\ccpapi.dll. In have seen a .NET based and a SOA based API for this within the HPC Server

There are one or two things you must not do. For example

             //Console.SetWindowSize(60, 30);

produced this error message:

Unhandled Exception: System.ArgumentOutOfRangeException: The console buffer size must not be less than the current size and position of the console window, norgreater than or equal to Int16.MaxValue.
Parameter name: height
Actual value was 30.
   at System.Console.SetBufferSize(Int32 width, Int32 height)
   at EQ.Japan.Program.Main(String[] args) in C:\Data\Work\EXposure+IT\RefactoringAndOprimization\CodePerformanceAnalysis\BenchmarkCode\VB_MPI\ConsoleApplication1\Program.cs:line 26

            //Console.SetWindowSize(60, 30);

I refactored some code to include

            private static void MyCalculation(int fromInclusive, int toExclusive, ref double[] loss)

This is called in the following way:

                int fromInclusive=1;
                int toExclusive = 194510;
                int MaxIntervals = 194510;
                int StepSize = MaxIntervals / Communicator.world.Size;
                int StepCount  = MaxIntervals / StepSize;
                double[] loss = new double[0x2f7cf];
                Console.WriteLine("StepCount = {0} " , StepCount);

                for (int z = 0; z < StepCount -1 ; z++)
                {
                    if (Communicator.world.Rank.Equals(z))
                    {
                        fromInclusive = 1 + z * StepSize;
                        toExclusive = 1 + (z+1) * StepSize;
                        MyCalculation(fromInclusive, toExclusive,ref loss);
                    }
                }
                if (Communicator.world.Rank.Equals(StepCount-1))
                {
                    fromInclusive = 1 + (StepCount-1) * StepSize;
                    toExclusive = MaxIntervals;
                    MyCalculation(fromInclusive, toExclusive, ref loss);
                }

                if (Communicator.world.Rank.Equals(0))
                {
                }

This worked well but I am having difficulties in gathering the loss[].

On the MPI.NET web site there are a number of examples how such aggregations can be made. Here is one of the examples that demonstrates the power MPI has for tightly coupled calculations

using System;
using MPI;

class Pi
{
    static void Main(string[] args)
    {
        int dartsPerProcessor = 10000;
        using (new MPI.Environment(ref args))
        {
            if (args.Length > 0)
                dartsPerProcessor = Convert.ToInt32(args[0]);
            Intracommunicator world = Communicator.world;                                        // <<<<<<<
            Random random = new Random(5 * world.Rank);
            int dartsInCircle = 0;
            for (int i = 0; i < dartsPerProcessor; ++i)
            {
                double x = (random.NextDouble() - 0.5) * 2;
                double y = (random.NextDouble() - 0.5) * 2;
                if (x * x + y * y <= 1.0)
                    ++dartsInCircle;
            }

            if (world.Rank == 0)
            {
                int totalDartsInCircle = world.Reduce<int>(dartsInCircle, Operation<int>.Add, 0);  // <<<<<<<<
                System.Console.WriteLine("Pi is approximately {0:F15}.",
                    4*(double)totalDartsInCircle/(world.Size*(double)dartsPerProcessor));
            }
            else
            {
                world.Reduce<int>(dartsInCircle, Operation<int>.Add, 0);                           // <<<<<<<<

            }
        }
    }
}

My next Attempt looked like

                for (int z = 0; z < StepCount -1 ; z++)
                {
                    if (Communicator.world.Rank.Equals(z))
                    {
                        fromInclusive = 1 + z * StepSize;
                        toExclusive = 1 + (z+1) * StepSize;
                        MyCalculation(fromInclusive, toExclusive,ref loss);
                    }
                }
                if (Communicator.world.Rank.Equals(StepCount-1))
                {
                    fromInclusive = 1 + (StepCount-1) * StepSize;
                    toExclusive = MaxIntervals;
                    MyCalculation(fromInclusive, toExclusive, ref loss);

                }

                Intracommunicator world = Communicator.world;
                world.Send<Double[]>(loss, 0,0);

                if (world.Rank == 0)
                {
                    System.Diagnostics.Debugger.Launch();

                    for (int z = 0; z < StepCount - 1; z++)
                    {
                        Double[] test = world.Receive<Double[]>(1, 0);
                    }
                }

But this produced the following error

System.Exception was unhandled
  Message="Other MPI error, error stack:\nMPI_Send(172): MPI_Send(buf=0x0442EB1C, count=1, dtype=USER<struct>, dest=0, tag=0, MPI_COMM_WORLD) failed\nMPID_Send(51): DEADLOCK: attempting to send a message to the local process without a prior matching receive"
  Source="MPI"
  StackTrace:
       at MPI.Communicator.Send[T](T value, Int32 dest, Int32 tag)
       at MPI.Communicator.Send[T](T value, Int32 dest, Int32 tag)
       at EQ.Japan.Program.Main(String[] args) in C:\Data\Work\EXposure+IT\RefactoringAndOprimization\CodePerformanceAnalysis\BenchmarkCode\VB_MPI\ConsoleApplication1\Program.cs:line 120
       at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException:

The mistake seems to be that a process must be allocated to gathering data. If this is not ready to recieve the data then this error happens

So my next Attempt Testing just how to send and receive data without the check about the number of processes. The example below works. We need at least 2 processes. 2 Processes means that we are running in serial, only after 3 processes we start going parallel...

                int fromInclusive = 1;
                int toExclusive = 194510;
                int MaxIntervals = 194510;
                int StepSize = MaxIntervals / Communicator.world.Size;
                int StepCount = MaxIntervals / StepSize;
                double[] loss = new double[0x2f7cf];
                    if (Communicator.world.Rank.Equals(1))
                    {
                        fromInclusive = 1;
                        toExclusive = 194510;
                        MyCalculation(fromInclusive, toExclusive, ref loss);
                    }
                Intracommunicator world = Communicator.world;

                if (world.Rank == 0)
                {

                    Double[] test = world.Receive<Double[]>(1, 0);
                    System.Diagnostics.Debugger.Launch();

                }
                else
                {
                    world.Send<Double[]>(loss, 0, 0);
                }

These are my first experiments with MPI. As I mentioned earlier MPI has been around a long time and there are communities out there that have solved these kinds of problems a long time ago. So my next steps will be to look at the tutorials on the MPI.NET web site and on http://math.acadiau.ca/ACMMaC/Rmpi/.

In addition to this I will be looking into the new parallel programming models that will be available with .Net 4.0. These are focused around raising the level of abstraction such that the programmer does not need to worry about threads and locks. Instead there is a programming model where these details have been abstracted away.