This is a very important week for me: I have my first presentation at GASP, a group of architects from Portugal. I'm really excited about it and working really hard to leave a good impression :).
I'm not really an expert on transactions and not that good at database planning and design, so this is a big challenge for me. Which is great!
I've seen a couple of presentations advocating that compensation is a much better approach to dealing with concurrency in SOA than distributed transactions. It is easy to see why: locks across machine boundaries, and maybe even organization boundaries, smell bad even before I try them. It would be really great if you could base your opinions on how uncomfortable an idea makes you!
In the real world, people prefer numbers to emotions when it comes to technology or science. It might be the case that some ISV is trying to sell us a couple of products that will make our life much easier. Maybe distributed transactions are not really that bad. Maybe we can pay the performance price and gain in development costs.
So what I am working on is a way to measure how much better a compensation-based solution is than a distributed transaction solution. One of these days I'm going to put a large enough team to work developing business code based on this difference, and I had better have a clear picture of the benefits, because getting it wrong will really hurt the company I work for in $ :).
I've been working in IT for some years, and this is the first time I can remember actually using something from college. I'm using the scientific method, planning an experiment to get some data out. Remember? Build a hypothesis, plan an experiment, collect data, compare it against the expected theoretical values and draw conclusions. This is the closest I can get to the excitement of accelerating particles at 7 TeV at the LHC :).
I have divided my experiment into several areas:
First I need a data model that exposes the problem of concurrent accesses, and a service-oriented solution that manages that data model.
If the data model is too small it will skew the experiment, because SQL Server optimizations will work better: the smaller each data row is, the more rows fit in a single 8 KB page.
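To make this concrete, here is a minimal sketch of the kind of master/detail schema I have in mind. The names (`ConcurrencyLab`, `Orders`, `OrderLines`) and the padding trick are illustrative assumptions, not the final model:

```csharp
// A sketch of a hypothetical master/detail schema for the experiment;
// table and column names are placeholders, not the final data model.
using System.Data.SqlClient;

class SchemaSetup
{
    const string Ddl = @"
        CREATE TABLE Orders (
            OrderId   INT IDENTITY PRIMARY KEY,
            Customer  NVARCHAR(100) NOT NULL,
            Total     DECIMAL(18, 2) NOT NULL,
            -- Padding so rows do not all land in one 8 KB page,
            -- which would let page-level optimizations hide the contention.
            Padding   CHAR(500) NOT NULL DEFAULT ''
        );
        CREATE TABLE OrderLines (
            OrderLineId INT IDENTITY PRIMARY KEY,
            OrderId     INT NOT NULL REFERENCES Orders(OrderId),
            Product     NVARCHAR(100) NOT NULL,
            Quantity    INT NOT NULL,
            Padding     CHAR(500) NOT NULL DEFAULT ''
        );";

    static void Main()
    {
        using (var connection = new SqlConnection(
            "Data Source=.;Initial Catalog=ConcurrencyLab;Integrated Security=true"))
        {
            connection.Open();
            using (var command = new SqlCommand(Ddl, connection))
                command.ExecuteNonQuery();
        }
    }
}
```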
Then I need a deployment model for that SOA solution:
- Single machine.
- Intra Network.
- Internet.
I need hardware to conduct the experiment, and at this moment I am stuck on this problem: I can only get two laptops… The I/O capability of SQL Server will greatly influence the experiment because of this. The good news is that, since the hardware is similar, it will affect all the services evenly, so the numbers will (hopefully) just be scaled down compared to real servers.
Then I need a way to simulate simultaneous concurrent accesses from a service consumer's perspective. This is not easy when you have limits on hardware. Real-world solutions are accessed by millions of different client machines, and there is no way to simulate that with threads: you have hardware and software constraints (such as the number of connections per socket). But there is not much to think about there; the only way I can think of is using different threads, as in the sketch below.
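A minimal sketch of what I mean, assuming a hypothetical `CallService()` operation standing in for the real service proxies:

```csharp
// Simulates N concurrent service consumers with plain threads and
// times each call; CallService() is a placeholder, not a real proxy.
using System;
using System.Diagnostics;
using System.Threading;

class LoadSimulator
{
    static void Main()
    {
        const int clientCount = 50; // simulated concurrent consumers

        var threads = new Thread[clientCount];
        for (int i = 0; i < clientCount; i++)
        {
            int clientId = i; // capture a stable copy for the closure
            threads[i] = new Thread(() =>
            {
                var watch = Stopwatch.StartNew();
                CallService(clientId); // placeholder for the real service call
                watch.Stop();
                Console.WriteLine("Client {0}: {1} ms",
                    clientId, watch.ElapsedMilliseconds);
            });
            threads[i].Start();
        }
        foreach (var thread in threads)
            thread.Join();
    }

    static void CallService(int clientId)
    {
        Thread.Sleep(100); // stand-in for the operation under test
    }
}
```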
After this I need to think about the isolation levels I want to test. I have decided to limit these to the following (see the sketch after the list):
- Read Uncommitted.
- Read Committed.
- Serializable.
- Snapshot.
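These four map directly onto the standard ADO.NET `System.Data.IsolationLevel` enumeration, so switching levels between test runs is just a parameter. A sketch of how I could apply each level per operation (the `Orders` query is a placeholder):

```csharp
// Runs one operation under a given isolation level via ADO.NET.
// Snapshot isolation also has to be enabled on the database first:
//   ALTER DATABASE ConcurrencyLab SET ALLOW_SNAPSHOT_ISOLATION ON;
using System.Data;
using System.Data.SqlClient;

class IsolationLevelRunner
{
    static readonly IsolationLevel[] LevelsUnderTest =
    {
        IsolationLevel.ReadUncommitted,
        IsolationLevel.ReadCommitted,
        IsolationLevel.Serializable,
        IsolationLevel.Snapshot
    };

    static void RunOperation(string connectionString, IsolationLevel level)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction(level))
            using (var command = connection.CreateCommand())
            {
                command.Transaction = transaction;
                command.CommandText = "SELECT COUNT(*) FROM Orders"; // placeholder
                command.ExecuteScalar();
                transaction.Commit();
            }
        }
    }
}
```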
Finally I need to decide how I will collect and organize the data to draw conclusions. I will have a single table that records the time of each operation. I will divide testing into phases so that I can correlate the different types of concurrent accesses: for instance, there will be a phase where only a master/detail table relation is accessed, and times where all possible scenarios are accessed simultaneously. With this I can extract the different averages and compare them. Enterprise Library can really help here.
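A rough sketch of what that single table and the recording call might look like; the layout (`Measurements`, `Phase`, `Scenario`, `DurationMs`) is my assumption of the final data collection model, not a settled design:

```csharp
// Records one timed operation into a single measurements table.
// Hypothetical schema:
//   CREATE TABLE Measurements (
//       MeasurementId BIGINT IDENTITY PRIMARY KEY,
//       Phase         NVARCHAR(50)  NOT NULL, -- e.g. 'MasterDetailOnly'
//       Scenario      NVARCHAR(50)  NOT NULL, -- isolation level or strategy
//       Operation     NVARCHAR(100) NOT NULL,
//       DurationMs    BIGINT        NOT NULL,
//       RecordedAtUtc DATETIME      NOT NULL
//   );
using System;
using System.Data.SqlClient;

class MeasurementRecorder
{
    public static void Record(string connectionString, string phase,
        string scenario, string operation, long durationMs)
    {
        const string insert =
            @"INSERT INTO Measurements
                  (Phase, Scenario, Operation, DurationMs, RecordedAtUtc)
              VALUES (@phase, @scenario, @operation, @durationMs, @recordedAt)";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(insert, connection))
        {
            command.Parameters.AddWithValue("@phase", phase);
            command.Parameters.AddWithValue("@scenario", scenario);
            command.Parameters.AddWithValue("@operation", operation);
            command.Parameters.AddWithValue("@durationMs", durationMs);
            command.Parameters.AddWithValue("@recordedAt", DateTime.UtcNow);
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

With everything in one table, the per-phase and per-scenario averages fall out of simple GROUP BY queries afterwards.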
In the next posts I will share my data models, data collection models and the source code for the experiment. Stay tuned.