Monday, November 17, 2008

Distributed Transactions in .NET 3.5

My on-going notes on the topic…

Modern SOA Coordination, Transactions, Business Activities, Orchestration and Choreography

Modern SOA solutions are composed of several services, sometimes structured in layers from the most elementary services to complex business services.

In some cases the technologies and frameworks used to build them differ. This heterogeneity is a major benefit, but it also adds complexity that requires technology to manage it.

One important field of study is how services exchange messages. This field produced a set of patterns known as Message Exchange Patterns.

Message Exchange Patterns – MEPs

The most basic exchange patterns are Request/Response, Fire-And-Forget and Solicit-Response.

The Request/Response is the basic pattern where a consumer emits a request message to a provider and receives the response back with the result of its request.

The Fire-And-Forget pattern is used when a consumer doesn’t require or care about the response of an operation. It emits the message and goes on with its life.

The Solicit-Response is the inverse of the Request/Response pattern: the provider sends a message to the consumer and waits for the consumer’s response.

Complex MEPs

By combining the previous patterns, complex MEPs are created. One popular complex MEP is Publish-Subscribe. In this pattern a party contacts another one, asking to be notified when a given event (also known as a topic) happens. When the event happens, the second party goes through its list of subscribers and publishes a notification to them.

Service Activities and Coordination

A business process is composed of multiple steps in multiple services. A service activity is any service interaction required to complete business tasks.

In a business process the order of activities is important; there are constraints limiting when an activity can be initiated or concluded. This introduces contextual information in the runtime environment so that it can keep track of the process state.

WS-Coordination is a WS-* standard describing a protocol to introduce and manage this contextual information. It is based on the coordinator service model:

  • Activation Service - Creates contexts and associates them to activities.
  • Registration Service - Where participant services register to use contextual information from a certain activity and a supported protocol.
  • Coordinator - The controller service that manages the composition.
  • Protocol-Specific-Services - WS-Coordination is a building block for other protocols like WS-AtomicTransaction. These protocols require specific services for managing details not covered by the WS-Coordination standard.

WS-AtomicTransaction

WS-AtomicTransaction is a coordination type, an extension to use with the WS-Coordination context management framework.

A service participates in an atomic transaction by first receiving a coordination context from the activation service. After that, it is allowed to register for the available transaction protocols.

The primary transaction protocols are:

  • Completion protocol to initiate the commit or abort states.
  • Durable2PC protocol for services representing permanent data repositories.
  • Volatile2PC protocol for services representing volatile data repositories.

An atomic transaction should be as short as possible in terms of duration. For the time it lasts there will be resources locked and concurrent requests will have to wait. Naturally the scalability of the application is greatly influenced by this.

Two Phase Commit Protocol Basic algorithm

Commit-request phase

1. The coordinator sends a query-to-commit message to all transaction participants and waits until it has received a reply from all of them.

2. Each participant executes the transaction up to the point where it has to decide to commit or abort.

3. It replies with an agreement message (votes Yes to commit), if the transaction succeeded, or an abort message (No, not to commit), if the transaction failed.

Commit phase

Success

If the coordinator received an agreement message from all participants during the commit-request phase:

1. The coordinator sends a commit message to all the cohorts.

2. Each cohort completes the operation, and releases all the locks and resources held during the transaction.

3. Each cohort sends an acknowledgment to the coordinator.

4. The coordinator completes the transaction when acknowledgments have been received.

Failure

If any cohort sent an abort message during the commit-request phase:

1. The coordinator sends a rollback message to all the cohorts.

2. Each cohort undoes the transaction using the undo log, and releases the resources and locks held during the transaction.

3. Each cohort sends an acknowledgment to the coordinator.

4. The coordinator completes the transaction when acknowledgments have been received.
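
The two phases can be sketched in a few lines of C#. This is only a toy, in-memory illustration of the algorithm described above: the names IParticipant, InMemoryParticipant and TwoPhaseCommitCoordinator are made up for this note, and there are no timeouts, acknowledgments, undo logs or recovery from coordinator failure, all of which a real implementation needs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical participant contract: one method per protocol message.
public interface IParticipant
{
    bool Prepare();   // commit-request phase: true = vote Yes, false = vote No
    void Commit();    // commit phase, success path
    void Rollback();  // commit phase, failure path
}

// Toy in-memory participant whose vote is fixed at construction time.
public class InMemoryParticipant : IParticipant
{
    private readonly bool canCommit;

    public InMemoryParticipant(bool canCommit)
    {
        this.canCommit = canCommit;
        this.State = "Active";
    }

    public string State { get; private set; }

    public bool Prepare()
    {
        this.State = this.canCommit ? "Prepared" : "Failed";
        return this.canCommit;
    }

    public void Commit() { this.State = "Committed"; }

    public void Rollback() { this.State = "RolledBack"; }
}

public static class TwoPhaseCommitCoordinator
{
    // Returns true when the transaction committed, false when it was rolled back.
    public static bool Run(IList<IParticipant> participants)
    {
        // Commit-request phase: collect a vote from every participant.
        bool allVotedYes = true;
        foreach (IParticipant participant in participants)
        {
            if (!participant.Prepare())
            {
                allVotedYes = false;
            }
        }

        // Commit phase: the same decision is sent to every participant.
        foreach (IParticipant participant in participants)
        {
            if (allVotedYes) { participant.Commit(); } else { participant.Rollback(); }
        }

        return allVotedYes;
    }
}

public static class TwoPhaseCommitDemo
{
    public static void Main()
    {
        var good = new List<IParticipant> { new InMemoryParticipant(true), new InMemoryParticipant(true) };
        var bad = new List<IParticipant> { new InMemoryParticipant(true), new InMemoryParticipant(false) };

        Console.WriteLine(TwoPhaseCommitCoordinator.Run(good)); // True: everyone voted Yes
        Console.WriteLine(TwoPhaseCommitCoordinator.Run(bad));  // False: a single No vote rolls everyone back
    }
}
```

Note that the coordinator still asks every participant to prepare even after a No vote, matching the description above where it waits for all replies before deciding.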

Business Activities

Business Activities manage long-running service activities. They do not support rolling back operations and differ from atomic transactions in the way they deal with errors. It is not possible to hold locks on data to ensure ACID properties in these interaction patterns.

Business Activities deal with concurrency and errors by providing alternative business logic to reverse previously made changes to the system's state.

WS-BusinessActivity is the WS-* protocol for these interaction patterns.

On top of this, orchestration allows business logic to be expressed in a standardized way using services. This is the role of WS-BPEL, but it is out of the scope of this talk.

Distributed Transactions in the WCF way

WCF is able to propagate transactions across the service boundary. This feature is known as transaction flow.

For this to work, transaction flow must be enabled on the binding at both ends of the communication.

<bindings>
<netTcpBinding>
  <binding name="netTcpWithTransactions" transactionFlow="true" />
</netTcpBinding>
</bindings>

Distributed transactions do not require reliability in the transport but enabling it reduces the number of transactions aborted by timeout (caused by lost messages).

<bindings>
  <netTcpBinding>
      <binding name="netTcpWithTransactions" transactionFlow="true" >
          <reliableSession enabled="true" />
      </binding>
  </netTcpBinding>
</bindings>

The transaction flow is configured per service operation with the TransactionFlow attribute:

· Allowed – The operation will accept incoming transactions.

· NotAllowed – The operation will not accept incoming transactions.

· Mandatory – The operation will only work if there is an incoming transaction.

Transaction flow is not allowed for one way calls (the client would not be able to abort the transaction).

Supported Transaction Protocols

WCF supports the following transaction protocols:

· Lightweight – Used inside the same AppDomain.

· OleTx – Used to propagate transactions across AppDomain, process and machine boundaries. It uses RPC calls in a Windows-specific format. Crossing the internet can cause problems because it uses ports that are typically closed.

· WS-AtomicTransaction – Used to propagate transactions across AppDomain, process and machine boundaries. Unlike OleTx it can cross the internet because it is HTTP based, supported by SOAP extensions.

The bindings that support transactions are designed to switch to the “best” (lighter) protocol depending on the operation conditions.

Transaction Managers

Associated with each transaction protocol and with a resource kind there is a transaction manager:

· LTM – Lightweight Transaction Manager. It manages transactions inside a single AppDomain when there is only one open connection in the transaction. If a second connection is opened in the same transaction and AppDomain, DTC is used instead. With SQL Server 2008 this is no longer true, and LTM can be used with multiple open connections in the same AppDomain.

· KTM – Kernel Transaction Manager. It is specific to Vista and manages kernel resources that support transactions.

· DTC – Distributed Transaction Coordinator. Manages both OleTx and WS-AT transactions.

[Figure: resources versus transaction managers]

WCF assigns the appropriate transaction manager, starting with the lightest one possible. When new resource managers enlist in the transaction, WCF can promote the transaction to the next-level manager. Once promoted there is no going back; the transaction runs elevated until it aborts or commits.

Ambient Transaction

The ambient transaction is the transaction in which the current code executes. It is available in the static property Transaction.Current. It is stored per thread.

Local Transaction and Distributed Transaction

The Transaction object is used for both distributed and local transactions. It exposes two identifiers: LocalIdentifier and DistributedIdentifier. The LocalIdentifier is always assigned, but the DistributedIdentifier is only created when the transaction is promoted to a DTC transaction.
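
A minimal sketch of how the ambient transaction and its two identifiers can be inspected, using only System.Transactions. The class name AmbientTransactionDemo is made up for this note. As far as I know, DistributedIdentifier stays at Guid.Empty while the transaction has not been promoted, which is what this sketch should show since no resource manager enlists.

```csharp
using System;
using System.Transactions;

public static class AmbientTransactionDemo
{
    public static void Main()
    {
        // Outside any scope there is no ambient transaction on this thread.
        Console.WriteLine("Ambient before scope: {0}", Transaction.Current != null);

        using (TransactionScope scope = new TransactionScope())
        {
            // Inside the scope, Transaction.Current is the ambient transaction.
            TransactionInformation info = Transaction.Current.TransactionInformation;

            Console.WriteLine("Local identifier:       {0}", info.LocalIdentifier);
            Console.WriteLine("Distributed identifier: {0}", info.DistributedIdentifier);
            Console.WriteLine("Status:                 {0}", info.Status);

            // Vote to commit; without this call the transaction aborts on Dispose.
            scope.Complete();
        }

        // Leaving the scope restores the previous ambient transaction (none here).
        Console.WriteLine("Ambient after scope: {0}", Transaction.Current != null);
    }
}
```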

Transactional Service Development

As mentioned in the book Programming WCF Services, WCF provides both explicit and implicit transaction programming modes. The explicit mode is used when the transactional objects are created explicitly in the code. The implicit mode is used when the code is marked with special attributes.

When the TransactionScopeRequired property of the OperationBehavior attribute is set to true, a transaction object is made available, either by using a transaction that is flowing through the execution chain or by providing a new one.

What will actually happen depends on the way the Transaction Flow is configured. The following picture summarizes the available options:

[Figure: transaction modes]

  • In the Client/Service mode the service will use the client transaction if possible. When it is not available it will create a service side transaction.
  • In the Client mode the service only uses the client transaction.
  • In the Service mode the service always has a transaction and it must differ from any transaction the client may or may not have.
  • In the None mode the service never has a transaction.
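
The four modes can be written down as attribute combinations. The sketch below is my reading of the mapping from Programming WCF Services, not something verified in this demo; the contract and operation names are made up. It also assumes a binding with transactionFlow="true", which the Mandatory operation requires.

```csharp
using System.ServiceModel;

[ServiceContract]
public interface ITransactionModesService   // hypothetical contract
{
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Allowed)]      // join the client transaction if one flows in
    void ClientServiceMode();

    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Mandatory)]    // refuse calls without a client transaction
    void ClientMode();

    [OperationContract]
    [TransactionFlow(TransactionFlowOption.NotAllowed)]   // never accept the client transaction
    void ServiceMode();

    [OperationContract]
    [TransactionFlow(TransactionFlowOption.NotAllowed)]
    void NoneMode();
}

public class TransactionModesService : ITransactionModesService
{
    // Client/Service: uses the client transaction when available, otherwise a new one.
    [OperationBehavior(TransactionScopeRequired = true)]
    public void ClientServiceMode() { }

    // Client: always runs in the transaction that flowed from the client.
    [OperationBehavior(TransactionScopeRequired = true)]
    public void ClientMode() { }

    // Service: always runs in its own, service-side transaction.
    [OperationBehavior(TransactionScopeRequired = true)]
    public void ServiceMode() { }

    // None: no ambient transaction inside the operation.
    [OperationBehavior(TransactionScopeRequired = false)]
    public void NoneMode() { }
}
```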

WCF manages almost every aspect of transactions, except that it does not know whether it should abort or commit. For that, the participating parties must vote to either abort or commit.

The voting can be configured declaratively with the TransactionAutoComplete property of the OperationBehavior attribute. In this case WCF will vote to commit if there are no errors (exceptions) in the operation.

The other option is explicit voting. In this case the operation must call the SetTransactionComplete method on the OperationContext. It must do so only if there are no errors, and it must do it only once: a second call would raise an InvalidOperationException.
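
A sketch of explicit voting. The contract and service names are made up, and the session settings reflect my understanding that WCF only accepts TransactionAutoComplete = false on session-aware, per-session services; treat that as an assumption to check against the documentation.

```csharp
using System.ServiceModel;

[ServiceContract(SessionMode = SessionMode.Required)]
public interface IOrderService   // hypothetical contract
{
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Allowed)]
    void PlaceOrder(string customerCode);
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerSession)]
public class OrderService : IOrderService
{
    // TransactionAutoComplete = false switches the operation to explicit voting.
    [OperationBehavior(TransactionScopeRequired = true, TransactionAutoComplete = false)]
    public void PlaceOrder(string customerCode)
    {
        // ... transactional work goes here; an exception leaves the vote uncast ...

        // Vote to commit, once, and only after all the work succeeded.
        OperationContext.Current.SetTransactionComplete();
    }
}
```

With the declarative option the method body would contain only the work, and the attribute would read TransactionAutoComplete = true (the default).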

Isolation Modes (enumeration in System.Transactions)

· Unspecified

· ReadUncommitted

· ReadCommitted

· RepeatableRead

· Serializable

· Chaos

· Snapshot
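
Outside WCF attributes, an isolation level from this enumeration can be requested when a scope is created. A minimal sketch using System.Transactions; the class name and the 30 second timeout are arbitrary choices for illustration.

```csharp
using System;
using System.Transactions;

public static class IsolationLevelDemo
{
    public static void Main()
    {
        TransactionOptions options = new TransactionOptions();
        options.IsolationLevel = IsolationLevel.ReadCommitted;
        options.Timeout = TimeSpan.FromSeconds(30);

        // Required: join the ambient transaction if there is one, otherwise start a new one.
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, options))
        {
            // The ambient transaction reports the level it was created with.
            Console.WriteLine(Transaction.Current.IsolationLevel);

            scope.Complete();
        }
    }
}
```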

A short summary of the main concurrency effects

  • Lost updates - Two transactions read the same row and each updates it based on the value it originally read, so the last write silently overwrites the other.
  • Dirty read - A transaction reads values that another transaction has not yet committed. Those values can still change or be rolled back, so actions based on them can follow wrong paths and lead the system to a state that violates business rules.
  • Non-repeatable read - Several reads of the same row within one transaction return different values because other transactions modify the row in between.
  • Phantom reads - Reading a set of rows twice returns different sets because another transaction inserted or deleted rows in between. For example, a read can include rows that another transaction deletes on commit; those rows will not come up again.

How to analyze what locks are in place at a given instant?

Use Windows performance counters.

Use SQL Server Profiler.

Query sys.dm_tran_locks.

Use the EnumLocks API.
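
As an illustration of the sys.dm_tran_locks option, the sketch below lists the current lock requests from C#. The class name and the default connection string are assumptions for this note; point it at your own server, and note that the login needs permission to view server state.

```csharp
using System;
using System.Data.SqlClient;

public static class LockInspector
{
    public static void Main(string[] args)
    {
        // Assumed connection string; pass your own as the first argument.
        string connectionString = args.Length > 0
            ? args[0]
            : "Server=(local);Database=master;Integrated Security=true";

        const string query =
            "SELECT request_session_id, resource_type, request_mode, request_status " +
            "FROM sys.dm_tran_locks ORDER BY request_session_id";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(query, connection))
        {
            connection.Open();

            using (SqlDataReader reader = command.ExecuteReader())
            {
                // One line per lock request: session, resource type, mode, status.
                while (reader.Read())
                {
                    Console.WriteLine("{0}\t{1}\t{2}\t{3}", reader[0], reader[1], reader[2], reader[3]);
                }
            }
        }
    }
}
```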

How to discover long running transactions

Query the dynamic management view sys.dm_tran_database_transactions.

Important rules to minimize deadlocks

  • Access objects in the same order.
  • Avoid user interaction in transactions.
  • Keep transactions short and in one batch.
  • Use a lower isolation level.
  • Use a row versioning-based isolation level.
  • Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
  • Use snapshot isolation.
  • Use bound connections.

Enable snapshot and row versioning

Read committed isolation using row versioning is enabled by setting the READ_COMMITTED_SNAPSHOT database option ON. Snapshot isolation is enabled by setting the ALLOW_SNAPSHOT_ISOLATION database option ON. When either option is enabled for a database, the Database Engine maintains versions of each row that is modified. Whenever a transaction modifies a row, an image of the row before modification is copied into a page in the version store.

Distributed Transactions in .NET 3.5 - Demo

It was a long October working on a hot release :). I finally got the time to get back to the transaction series. The demo uses a very simple system metaphor where services are decomposed into:

  • Data Contracts to transport business entity data across the wire in an optimized fashion.
  • Message Contracts to represent the services request and response and to allow service operation changes without breaking contracts.
  • Service Contracts to represent the service operations.
  • Fault Contracts to represent errors in the services (not explored in this demo).
  • Services Web Site.
  • Entity Framework Domain Model.
  • ASP.NET Web Pages for querying the data.

Transactions are enabled at the WS HTTP binding:

<wsHttpBinding>
    <binding name="wsHttp" transactionFlow="true" />
</wsHttpBinding>

The isolation level is set to ReadCommitted:

[ServiceBehavior(TransactionIsolationLevel = IsolationLevel.ReadCommitted)]
public class CustomersService : ICustomersService
{
...
}

Transactional operations set the TransactionScopeRequired property of the OperationBehavior attribute to manage transactions:

[OperationBehavior(TransactionScopeRequired = true)]
public CreateCustomerResponse CreateCustomer(CreateCustomerRequest request)
{
...
}

And the Entity Framework integrates just fine with WCF distributed transactions:

using (DemosEntities entities = new DemosEntities())
{
    // create the entity

    Customer customer = new Customer();

    // translate the data contract to the entity
    
    customer.Name = request.Customer.Name;
    customer.Id = Guid.NewGuid();

    // add it to the set

    entities.AddToCustomerSet(customer);

    // "commit" changes

    entities.SaveChanges();
}

Distributed Transactions Demo v1

Sunday, September 28, 2008

Distributed Transactions versus Compensation – Part 1 the planning

This week is a very important week for me. I have my first presentation at GASP, a group of architects from Portugal. I’m really excited about it and working really hard to leave a good impression :).

I’m not really an expert on transactions and not that good at database planning and design. So this poses a really big challenge for me, which is really great!

I’ve seen a couple of presentations advocating that compensation is a much better approach to dealing with concurrency in SOA than distributed transactions. It is really easy to visualize why: locks across machine boundaries, and maybe even organization boundaries, smell bad even before I try them. It would be really great if you could base your opinions on how uncomfortable an idea makes you!

In the real world people prefer numbers to emotions when it comes to technology or science. It might be the case that some ISV is trying to sell us a couple of products that will make our life much easier. Maybe it is not really that bad. Maybe the performance price is paid back by lower development costs.

So what I am working on is a way to measure how much better a compensation-based solution is than a distributed transaction solution. One of these days I’m going to put a large enough team developing business code based on this difference, and I had better get a clear picture of the benefits because this will really hurt ($) the company I work for :).

I’ve been working in IT for some years and this is the first time I can remember actually using something I learned in college. I’m using the scientific method and planning an experiment to get some data out. Remember? Build a hypothesis, plan an experiment, collect data, compare it against the expected theoretical values and draw conclusions. This is the closest I can get to the excitement of accelerating particles at 7 TeV at the LHC :).

I have divided my experiment into several areas:

First I need a data model that exposes the problem of concurrent accesses and a service oriented solution that manages that data model.

If the data model is too small it will affect the experiment because SQL optimizations will work better. The smaller each data row is, the more rows fit in a single SQL Server page.

Then I need a deployment model for that SOA solution:

    • Single machine.
    • Intranet.
    • Internet.

I need hardware to conduct the experiment, and at this moment I am stuck on this problem. I can only get two laptops… The IO capability of SQL Server will be greatly limited by this. The good news is that it will affect all the services evenly (because the hardware is similar), so the numbers will simply be scaled down compared to real servers (hopefully).

Then I need a way to simulate simultaneous concurrent accesses from a service consumer perspective. This is not easy when you have limits on hardware. Real world solutions are accessed by millions of different client machines and there is no way you can simulate that with threads. You have hardware and software constraints (such as number of connections per socket). But there is not much to think about there: the only way I can think of is using different threads.

After this I need to think about the isolation levels I want to test. I have decided to limit these to:

  • Read Uncommitted.
  • Read Committed.
  • Serializable.
  • Snapshot.

Finally I need to decide how I will collect and organize data to draw conclusions. I will have a single table that records the time of each operation. I will divide testing into phases so that I can correlate different types of concurrent accesses. For instance there will be a phase where only a master/detail table relation is accessed. There will be phases where all possible scenarios are accessed simultaneously. With this I can extract different averages and compare them. Enterprise Library can really help here.

In the next posts I will post my data models, data collection models and source code for the experiment. Stay tuned.

Monday, September 8, 2008

Dependency Containers and Design by Contract

Design by contract used to sound complex to me, before I really understood the idea and got my hands dirty with it.

The main idea I try to follow is that major system components should be represented by what they do and not by what they are. Let’s pick a simple example, a collection. A collection is something that is able to return an enumerator to traverse its elements, can add an element to itself and can return how many elements it is holding. These are some of the things a collection is able to do.

A collection can be a list, it can be a tree, it can be a stack, as long as it can behave as a collection it is a collection. If you search for Duck Typing you will get a lot of funny stories about this idea.

When possible we should encourage replacing concrete objects with their abstractions, because code written against abstractions is easier to reuse.

In most cases designing by contract means writing the interfaces first and implementing the classes that realize them afterwards. This is what I did in this really dummy invoicing system:

#region contracts

   /// <summary>
   /// Manages aspects related with user authorization.
   /// </summary>
   public interface IAuthorizationManager
   {
       bool Authenticate(string user, string password);
   }

   /// <summary>
   /// Traces who did what, when.
   /// </summary>
   public interface IAuditingManager
   {
       bool WriteAuditInformation(string operation, string who, DateTime when, object what);
   }

   /// <summary>
   /// Manages invoicing operations.
   /// </summary>
   public interface IInvoicingManager
   {
       void CreateInvoice(string salesPerson, string customerCode, string[] products, decimal[] unitPrices, decimal[] quantities);
   }

#endregion

Then I created a concrete class that uses these components (I could have been a purist and written a contract for it :) ):

public class SalesSystem
{
   public IAuditingManager AuditingSystem
   {
       get;
       private set;
   }

   public IAuthorizationManager AuthorizationSystem
   {
       get;
       private set;
   }

   public IInvoicingManager InvoicingSystem
   {
       get;
       private set;
   }

   public SalesSystem(IAuditingManager auditingSystem, IAuthorizationManager authorizationSystem, IInvoicingManager invoicingSystem)
   {
       this.AuditingSystem = auditingSystem;
       this.AuthorizationSystem = authorizationSystem;
       this.InvoicingSystem = invoicingSystem;
   }
}

Note how I pass in all the objects in the constructor. It is worth reading my post on the “Law of Demeter”.

These are my really dummy realizations of these contracts (do note the really bulletproof security code :)!):

#region really dummy realizations

internal class MyAuthorizationManager : IAuthorizationManager
{
    #region IAuthorizationManager Members

    public bool Authenticate(string user, string password)
    {
        if (string.Compare(user, "user1", StringComparison.OrdinalIgnoreCase)==0)
        {
            if (string.Compare(password, "123", StringComparison.Ordinal) == 0)
            {
                return true;
            }
        }

        return false;
    }

    #endregion
}

internal class MyAuditingManager : IAuditingManager
{
    #region IAuditingManager Members

    public bool WriteAuditInformation(string operation, string who,DateTime when, object what)
    {
        Console.WriteLine("{0} by {1} at {2} over {3}", operation, who, when, what.ToString());
        return true;
    }

    #endregion
}

internal class MyInvoicingManager : IInvoicingManager
{
    public MyInvoicingManager(IAuditingManager auditing)
    {
        this.Auditing = auditing;
    }

    public IAuditingManager Auditing
    {
        get;
        private set;
    }

    #region IInvoicingManager Members

    public void CreateInvoice(string salesPerson, string customerCode, string[] products, decimal[] unitPrices, decimal[] quantities)
    {
        Console.WriteLine("Invoice created for {0}", customerCode);
        this.Auditing.WriteAuditInformation("CreateInvoice", salesPerson, DateTime.Now, "Invoice");
    }

    #endregion
}

#endregion

Now comes the time to introduce the idea of a dependency container. It is really a component to help lazy guys like me. I don’t want to write code to create an authorization manager, an auditing manager and an invoicing manager just to create my sales system. I still have to work for the next 30 years and I need to save my fingers.

Jokes aside, what I don’t want to do is call the constructors of those components, because as the system evolves they will change; when they change, things will break and I will be in trouble. What I want is a system that is really smart and does all those boring news for me. This system is Unity.

Unity is a dependency container and it can resolve dependencies for me. If I tell it which classes realize my contracts, it can then create instances of other classes that require these contracts to be built:

using System;
using Microsoft.Practices.Unity;

class Program
{
    static void Main(string[] args)
    {
        UnityContainer unityContainer = new UnityContainer();

        unityContainer.RegisterType(typeof(IAuditingManager), typeof(MyAuditingManager));
        unityContainer.RegisterType(typeof(IAuthorizationManager), typeof(MyAuthorizationManager));
        unityContainer.RegisterType(typeof(IInvoicingManager), typeof(MyInvoicingManager));

        SalesSystem system = unityContainer.Resolve<SalesSystem>();

        if (system.AuthorizationSystem.Authenticate("User1", "123"))
        {
            system.InvoicingSystem.CreateInvoice("User1", "Customer1", new string[] { "Product1" }, new decimal[] { 10.0M }, new decimal[] { 1.2M });
        }

    }
}
Try to write the code to replace these by Mock objects :). If you are a fan of TDD you will love this.
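
As a hint, here is a hand-written test double for the auditing contract, with no mocking library involved. RecordingAuditingManager, InvoicingManagerUnderTest and MockDemo are names invented for this sketch, and the invoicing class is a trimmed copy of the one above so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public interface IAuditingManager
{
    bool WriteAuditInformation(string operation, string who, DateTime when, object what);
}

// Test double: records the calls instead of writing to the console.
public class RecordingAuditingManager : IAuditingManager
{
    public readonly List<string> Calls = new List<string>();

    public bool WriteAuditInformation(string operation, string who, DateTime when, object what)
    {
        this.Calls.Add(operation + " by " + who);
        return true;
    }
}

// Trimmed copy of the invoicing manager: it only knows the auditing contract.
public class InvoicingManagerUnderTest
{
    private readonly IAuditingManager auditing;

    public InvoicingManagerUnderTest(IAuditingManager auditing)
    {
        this.auditing = auditing;
    }

    public void CreateInvoice(string salesPerson, string customerCode)
    {
        this.auditing.WriteAuditInformation("CreateInvoice", salesPerson, DateTime.Now, "Invoice");
    }
}

public static class MockDemo
{
    public static void Main()
    {
        RecordingAuditingManager audit = new RecordingAuditingManager();
        InvoicingManagerUnderTest invoicing = new InvoicingManagerUnderTest(audit);

        invoicing.CreateInvoice("User1", "Customer1");

        // The test can now assert on what was audited.
        bool verified = audit.Calls.Count == 1 && audit.Calls[0] == "CreateInvoice by User1";
        Console.WriteLine(verified ? "audit call verified" : "audit call missing");
    }
}
```

With Unity in the picture, the same double could be handed to the container through RegisterInstance before resolving the sales system, so the class under test never notices the swap.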

Law of Demeter, or please count your dots.

The “Law of Demeter” says that a method “M” should only invoke the methods of the following kinds of objects:

· Itself;

· Its parameters;

· Any object it creates;

· Its direct component objects.

When programming in C#, typing more than two dots in an expression (excluding the this.) is a sign that this law is being broken. The study that gave birth to this law argues that complying with it enhances the maintainability of the system being developed.

It is easy to see why empirically: the more dots we type, the more details we reveal about how an object is built, and the more likely we are to break something when that implementation changes.

I read a lot of code and most people just ignore this problem; I used to as well. Now I break the rule a couple of times a day, but I always do it after thinking for a couple of seconds about what I could do to respect it. Most of the time it starts by adding one argument to a class constructor and replacing the class with a contract (abstract class or interface).

Why? Well, if you need this.a.b.c, then hold a reference to a contract that represents what c does directly in your current class. Imagine you do this all the time. Now imagine you replace c’s contract with a mock object. It’s really easy to test… ;)

Adding arguments to a constructor used to put extra effort on whoever creates instances of the class. Today, with the latest enhancements in dependency inversion and dependency containers, it is not that bad.

Most containers also support injection into instance properties, so there is a lot to gain from following this simple principle and it is really not that hard. Please read the article on containers to get your hands dirty.
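
A small before-and-after sketch of the idea. All the types here (Order, Customer, Address, IPostalCodeProvider and the two calculators) are invented for illustration, and the "1000" prefix is an arbitrary rule.

```csharp
using System;

// Contract that represents the only thing the calculator needs.
public interface IPostalCodeProvider
{
    string PostalCode { get; }
}

public class Address : IPostalCodeProvider
{
    public string PostalCode { get; set; }
}

public class Customer
{
    public Address Address { get; set; }
}

public class Order
{
    public Customer Customer { get; set; }
}

// Breaks the law: it knows how Order, Customer and Address are wired together.
public class ChattyShippingCalculator
{
    private readonly Order order;

    public ChattyShippingCalculator(Order order)
    {
        this.order = order;
    }

    public bool IsLocal()
    {
        return this.order.Customer.Address.PostalCode.StartsWith("1000", StringComparison.Ordinal);
    }
}

// Respects the law: one more constructor argument, expressed as a contract.
public class ShippingCalculator
{
    private readonly IPostalCodeProvider destination;

    public ShippingCalculator(IPostalCodeProvider destination)
    {
        this.destination = destination;
    }

    public bool IsLocal()
    {
        return this.destination.PostalCode.StartsWith("1000", StringComparison.Ordinal);
    }
}

public static class DemeterDemo
{
    public static void Main()
    {
        Address address = new Address { PostalCode = "1000-001" };
        Order order = new Order { Customer = new Customer { Address = address } };

        Console.WriteLine(new ChattyShippingCalculator(order).IsLocal());
        Console.WriteLine(new ShippingCalculator(address).IsLocal());
    }
}
```

The second calculator can be tested with any object that implements IPostalCodeProvider, without building an order and a customer first.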

Monday, September 1, 2008

Subtleties in C# Anonymous methods

Last Friday I had a really bad time figuring out a bug in an anonymous method. The bug was related to the way anonymous methods are implemented in C# and to variable scoping. Behind the scenes the compiler generates a class to hold the delegate and the variables captured from the scope of the block that declares the delegate. Those captured variables are kept in memory for the whole lifetime of the delegate object.

The original code was:

foreach (PropertyInfo property in allProperties)
{
    object propertyValue = GetTypePropertyValue(this, property);

    INotifyPropertyChanged myObject1Value = propertyValue as INotifyPropertyChanged;
    if (myObject1Value != null)
    {
        myObject1Value.PropertyChanged += delegate(object sender, PropertyChangedEventArgs ev)
        {
            this.NotifyPropertyChanged(property.Name);
        };
    }
}

The fixed code is something like this:

foreach (PropertyInfo property in allProperties)
{
    object propertyValue = GetTypePropertyValue(this, property);

    INotifyPropertyChanged myObject1Value = propertyValue as INotifyPropertyChanged;
    if (myObject1Value != null)
    {
        string propertyName = property.Name;
        myObject1Value.PropertyChanged += delegate(object sender, PropertyChangedEventArgs ev)
        {
            this.NotifyPropertyChanged(propertyName);
        };
    }
}

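The difference is declaring a string variable inside the loop body, outside the anonymous method, and copying the property’s name into it. Behind the scenes the compiler captures variables, not values: the loop variable property is a single variable shared by all iterations, so every handler ended up seeing the last property. The propertyName variable is declared inside the loop body, so each iteration gets its own copy, and that copy is never modified afterwards.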
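
The effect can be reproduced without events or reflection. The sketch below uses a for loop on purpose: its loop variable is shared across iterations in every version of C#, whereas the foreach behaviour described above was changed in C# 5, where each iteration gets a fresh variable. The class name and the sample strings are made up.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    public static void Main()
    {
        string[] names = { "Name", "Address", "Phone" };
        List<Func<string>> shared = new List<Func<string>>();
        List<Func<string>> copied = new List<Func<string>>();

        for (int i = 0; i < names.Length; i++)
        {
            // Captures the variable i itself, which keeps changing after this line.
            shared.Add(delegate { return i < names.Length ? names[i] : "<past the end>"; });

            // Captures a variable that is new in every iteration and never reassigned.
            string name = names[i];
            copied.Add(delegate { return name; });
        }

        // By now i equals names.Length, so every "shared" delegate sees the same final value.
        foreach (Func<string> read in shared)
        {
            Console.WriteLine(read());
        }

        // Each "copied" delegate still sees the value from its own iteration.
        foreach (Func<string> read in copied)
        {
            Console.WriteLine(read());
        }
    }
}
```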

Friday, August 29, 2008

Quick InternalsVisibleTo How-to

The InternalsVisibleToAttribute allows an assembly to make its members marked with the internal visibility modifier visible to other assemblies. I’m not going to discuss whether this is good or bad; like everything in life it’s not black or white, it’s rather gray :).

When the project assemblies are not strongly signed, it is just a matter of adding the attribute with the assembly name, without the version, culture and public key token, typically in the “assemblyinfo.cs” file.

[assembly: InternalsVisibleTo("MyCompany.MyAssemblyName")]

When the project assemblies are strongly signed, the first step is getting the public key of the assembly that is going to see the internals. To do this we use the SN command.

sn -Tp <assembly file name>

This will print out the huge public key of the assembly. Then it is just a matter of adding it to the attribute:

[assembly: InternalsVisibleTo("MyCompany.MyAssemblyName, PublicKey= 02f…f0c68e6c7")]

(I shortened the key because it is a huge string).