Wednesday, September 9, 2009

Entity Framework and Plain Old CSharp Objects

Introduction

This will be a new feature in .NET 4.0, but for now we can still get a glimpse of it with this adapter written by one of the EF developers. To use POCOs you still need to write the metadata file(s) that represent the storage schema, the conceptual schema and the mappings between them. In .NET 4.0 it will also be possible to use only code to create the metadata (whether this is good or bad is out of scope :) ).

When we create an EF model for a given database, the developer tools create an EDMX file and use a code generator to output code for what is modeled in the EDMX file. This includes the business entities. This is really cool for those quick demos we see everywhere on the Internet, but on large applications things tend to get complicated. The idea of the domain model is that you create a rich object model to represent the domain, one that is not tied to any technical or architectural detail of the implementation. It is as close as objects can be to their real-world equivalents. But the original implementation of EF led to domain models that were tied and coupled to the details of object-relational mapping.

One good example I use all the time to illustrate how things can get ugly when ORMs mess with the domain model is the number of times I have had to write the Customer class. There is a Customer class with attributes for one ORM, another one has methods that contain mapping code, others derive from a base class that is tied to the ORM, and some put constraints on how you write the class code, such as forcing members to be virtual. Wouldn’t it be perfect if I could just write the Customer class my own way and tell the ORM how to map it without having to change my code? The EFPocoAdapter is about exactly that.

Inside the EDMX file there are actually three models. Originally they were separate files and it is still possible to keep them separated. This is the strategy I like the most and the one used in this article. They have the extensions ssdl, csdl and msl, respectively the store schema definition language, the conceptual schema definition language and the mapping specification language.

In this article I will use SQL Server 2008 R2 and AdventureWorks for SQL Server 2008 as the database. We will map the Product table and count its rows using POCOs and LINQ.

Implementing a sample application with EF and POCOs

Our demo application follows the architecture below. We place our business objects inside a business objects assembly with no further dependencies. The code that connects those objects to the Entity Framework is labeled Business Objects Adapter on the diagram. It is built on top of the EF POCO Adapter, which in turn is built on top of the Entity Framework.

entity framework with POCO architecture

If we wanted to divide our domain into more than one assembly, there would be one adapter per business object assembly.

The first step is writing the business objects for the application, and ours could not be simpler:

public class Product
{
    public string Name
    {
        get;
        set;
    }

    public int ProductNumber
    {
        get;
        set;
    }
}

The next step is to write the EF metadata that describes this object, the storage table it is associated with, and how they map to each other. Starting with the conceptual model (the Product business object):

   1: <?xml version="1.0" encoding="utf-8"?>
   2: <Schema Namespace="AdventureWorks" Alias="Self" xmlns="http://schemas.microsoft.com/ado/2006/04/edm"
   3:         xmlns:objectmapping="http://code.msdn.microsoft.com/EFPocoAdapter/ObjectMapping.xsd"
   4:         >
   5:   <EntityContainer Name="AdventureWorksEntities">
   6:     <EntitySet Name="Products" EntityType="AdventureWorks.Product" />
   7:   </EntityContainer>
   8:  
   9:   <EntityType Name="Product">
  10:     <Key>
  11:       <PropertyRef Name="ProductNumber"/>
  12:     </Key>
  13:     <Property Name="ProductNumber" Type="Int32" Nullable="false" />
  14:   </EntityType>
  15:   
  16: </Schema>

Next comes the storage model:

   1: <?xml version="1.0" encoding="utf-8"?>
   2: <Schema Namespace="AdventureWorks.Store" Alias="Self" xmlns="http://schemas.microsoft.com/ado/2006/04/edm/ssdl" Provider="System.Data.SqlClient" ProviderManifestToken="2005">
   3:   <EntityContainer Name="dbo">
   4:     <EntitySet Name="Product" EntityType="AdventureWorks.Store.Product" Schema="Production" />
   5:   </EntityContainer>
   6:   
   7:   <EntityType Name="Product">
   8:     <Key>
   9:       <PropertyRef Name="ProductID" />
  10:     </Key>
  11:     <Property Name="ProductID" Type="int" Nullable="false" StoreGeneratedPattern="Identity" />
  12:   </EntityType>
  13:   
  14: </Schema>

And finally the mapping:

   1: <?xml version="1.0" encoding="utf-8"?>
   2: <Mapping Space="C-S" xmlns="urn:schemas-microsoft-com:windows:storage:mapping:CS">
   3:   <EntityContainerMapping
   4:     StorageEntityContainer="dbo"
   5:     CdmEntityContainer="AdventureWorksEntities">
   6:     <EntitySetMapping Name="Products">
   7:       <EntityTypeMapping TypeName="AdventureWorks.Product">
   8:         <MappingFragment StoreEntitySet="Product">
   9:           <ScalarProperty Name="ProductNumber" ColumnName="ProductID"/>
  10:         </MappingFragment>
  11:       </EntityTypeMapping>
  12:     </EntitySetMapping>
  13:   </EntityContainerMapping>
  14: </Mapping>

There are subtle differences between the C# Product object and the Product table. For instance, I decided to expose the ProductID column as a property named ProductNumber. The Entity Framework supports much more than this simple column mapping, but I am trying to build a working demo that is as simple as possible to help with getting started.

The EFPocoAdapter contains a small command line code generator that outputs the adapter files for us. I made a small batch file to do this in order to use relative file paths. In my source tree, integrating the commands in the pre-build event as suggested by the EFPocoAdapter documentation would generate a command that is too long for the command processor. (One thing I can’t understand is why we still have limits on path and command lengths.)

SET CLASSGEN=C:\prj\source\_ExternalReferences\EFPocoAdapter\EFPocoClassGen\bin\Debug\EFPocoClassGen.exe

%CLASSGEN% /verbose "/incsdl:..\AdventureWorks.csdl" ^
    "/ref:..\Pedrosal.BusinessObjects\bin\Debug\Pedrosal.BusinessObjects.dll" ^
    "/outputfile:PocoAdapter.cs" /map:AdventureWorks=Pedrosal.BusinessObjects

%CLASSGEN% /verbose "/incsdl:..\AdventureWorks.csdl" ^
    "/ref:..\Pedrosal.BusinessObjects\bin\Debug\Pedrosal.BusinessObjects.dll" ^
    "/outputfile:AdventureWorksEntities.cs" /map:AdventureWorks=Pedrosal.BusinessObjects ^
    /mode:PocoContainer

PAUSE

Finally the test client code:

class Program
{
    static void Main(string[] args)
    {
        using (AdventureWorksEntities ent = new AdventureWorksEntities())
        {
            // test 1 - count the products
            var productCount = ent.Products.Count();
            Console.WriteLine("You have {0} products in the database.", productCount);
        }
    }
}
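Counting rows is the simplest possible query, but any other LINQ query over the same POCO set follows the same pattern. A hypothetical sketch; it sticks to ProductNumber, since that is the only property mapped in the metadata above:

using (AdventureWorksEntities ent = new AdventureWorksEntities())
{
    // test 2 - filter the products; only ProductNumber is in the
    // conceptual model above, so the query uses only that property
    var lowNumbers = from p in ent.Products
                     where p.ProductNumber < 100
                     select p.ProductNumber;

    foreach (int number in lowNumbers)
    {
        Console.WriteLine("Product number: {0}", number);
    }
}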

The database connection works because the generated code instructs the Entity Framework to read a connection string named AdventureWorksEntities from the App.config file. In that connection string we must tell EF where the metadata files are, and the easiest way is to have them copied to the output path.

   1: <?xml version="1.0" encoding="utf-8" ?>
   2: <configuration>
   3:   <connectionStrings>
   4:     <add name="AdventureWorksEntities" connectionString="metadata=AdventureWorks.csdl|AdventureWorks.ssdl|AdventureWorks.msl;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=.;Initial Catalog=AdventureWorks2008;Integrated Security=True;MultipleActiveResultSets=True&quot;" providerName="System.Data.EntityClient" />
   5:   </connectionStrings>
   6: </configuration>

Download sample here.

Have fun,

Thursday, September 3, 2009

About debugging memory problems – The case of the Web Cache

From time to time the day comes when we have to test how an application behaves in terms of memory under heavy load. We might be chasing a memory leak, tuning some caching logic, or just writing a report about the way an application uses memory. There are a lot of tools on the market that help with this kind of test. At the moment I use AQtime from AutomatedQA.

Nevertheless, there are plenty of reasons to learn how to use the basic tools. One good reason is that you will never get those IT guys to install advanced tools on a production server :). Another is that these tools work at a higher level of abstraction; they hide the internals and are therefore bad for learning. Once you know how to do things the dirty old way, you will love the time savings you get when you can just turn on your performance testing suite and get all the data you want with a couple of clicks.

I encourage everyone to learn with the basics and then reap the benefits (time) of using a performance test suite.

In my case it was a web application at Primavera that needed some tuning in terms of memory.

Setting Performance Counters

When I am looking at memory optimization or hunting memory-related bugs, these are the counters I use most (a programmatic way to read them follows the list):

  • \.NET CLR Memory(w3wp)\*
  • \W3SVC_W3WP(application pool)\Active Requests
  • \W3SVC_W3WP(application pool)\Active Threads Count
  • \W3SVC_W3WP(application pool)\Current File Cache Memory Usage
  • \W3SVC_W3WP(application pool)\Requests / Sec
  • \.NET CLR Memory(w3wp)\Large Object Heap size
  • \Memory\% Committed Bytes In Use
  • \Memory\Committed Bytes
  • \ASP.NET Apps v2.0.50727\Cache Total Entries
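The same counters can also be read programmatically, which is handy for quick checks or custom logging. A minimal sketch using System.Diagnostics; the instance name "w3wp" is an assumption, so check the actual instance names on your machine:

using System;
using System.Diagnostics;

class CounterPeek
{
    static void Main()
    {
        // The instance name is an assumption; list the real ones with
        // new PerformanceCounterCategory(".NET CLR Memory").GetInstanceNames()
        using (PerformanceCounter gen2Size = new PerformanceCounter(
            ".NET CLR Memory", "Gen 2 heap size", "w3wp"))
        using (PerformanceCounter allHeaps = new PerformanceCounter(
            ".NET CLR Memory", "# Bytes in all Heaps", "w3wp"))
        {
            Console.WriteLine("Gen 2 heap size: {0}", gen2Size.NextValue());
            Console.WriteLine("Bytes in all heaps: {0}", allHeaps.NextValue());
        }
    }
}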

To set these up, use the Reliability and Performance Monitor:

performance monitor program shortcut

Then create a new data collector set:

create the data collector set

Choose to create it manually and then Next:

created the data collector set wiz - step 1

Choose to store the performance counter data and then Next:

created the data collector set wiz - step 2

Choose the required counters by pressing Add:

 created the data collector set wiz - step 3

Choose the .NET CLR Memory group (this selects all the counters inside) and choose the w3wp process (if it is not running, start it by browsing to one of the site pages):

 created the data collector set wiz - step 4

Then add the Memory counters:

 created the data collector set wiz - step 5

In this case I typically do not add the group but only the two mentioned ones:

created the data collector set wiz - step 6

When you are done adding counters, press OK to return to the wizard:

 created the data collector set wiz - step 7

After pressing Next, we can choose where to store the files:

 created the data collector set wiz - step 8

And the last wizard page allows us to change the user running the collection:

 created the data collector set wiz - step 9

Now we have to start collecting data:

start the collection

And notice that a report is created to show the collection results:

collection running

The actual results are only shown when the collection is stopped. It can be started and stopped repeatedly, and each time a new report is created.
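If you prefer scripting the collection over clicking through the wizard, the logman tool that ships with Windows can create and control the same kind of data collector set. A rough sketch; the collector name and output path are placeholders:

logman create counter MemoryInvestigation -c "\.NET CLR Memory(w3wp)\*" "\Memory\Committed Bytes" -o C:\PerfLogs\MemoryInvestigation
logman start MemoryInvestigation
REM ... put the application under load here ...
logman stop MemoryInvestigation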

What do you get out of performance counting

There are no recipes for which chart will guide you to the problem. You have to put your detective hat on and follow your intuition :). But here are some examples of charts that can say something about what is making your application misbehave.

In this application the memory footprint was increasing a lot and we suspected a memory leak. But after some time the memory settled around a constant average. This was under heavy load using a stress testing tool. On the chart I am plotting the Gen 2 heap size, the Gen 2 collections and the bytes in all heaps.

It is easy to see that the Gen 2 heap is what is causing the memory increase. The blue line (bytes in all heaps) follows the pink one (Gen 2). GC collections are running normally, so all these objects must be rooted somehow.

memory evolution 01 

I attached WinDbg to the process and checked the roots of a couple of random objects, in this case Web UI objects. They were being rooted by the cache system, so I added a couple more counters.
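For reference, the root check was the usual SOS routine in WinDbg, roughly this sequence (.loadby sos mscorwks loads the SOS extension for .NET 2.0/3.5; !dumpheap finds instances and !gcroot walks their roots; the type and address here are illustrative):

.loadby sos mscorwks
!dumpheap -stat
!dumpheap -type System.Web.UI.Page
!gcroot <object address>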

In this chart we can see the evolution of cache entries and the number of hits and misses. The increase in memory tends to follow the increase in cache entries, and as time passes and memory settles, the number of hits increases. At this point most of the possible pages are already in the cache. Also, old pages are dying and being replaced by new ones, which is why memory increases and decreases in small deltas.

cache hits and memory before

This application was using a lot of memory because it was adding items to the Cache with the CacheItemPriority.NotRemovable option and a long expiration time. The server was having a hard time getting rid of all those items in the cache in the first minutes. Then some of them started to expire and memory was reclaimed.

Tuning the performance

Recommendation 1 – Strings

One good thing to look at is how much memory strings are using. Some applications tend to abuse string concatenation, and that fills memory with intermediate results. Every + or & operator creates a new string in memory, so if you are concatenating 4 strings to get the final result, you have created three unneeded instances. The alternative is the StringBuilder class.
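A minimal sketch of the difference (the loop count is arbitrary):

using System;
using System.Text;

class StringDemo
{
    static void Main()
    {
        // Every += allocates a brand new string; the intermediates become garbage.
        string slow = string.Empty;
        for (int i = 0; i < 1000; i++)
        {
            slow += "line " + i + Environment.NewLine;
        }

        // A StringBuilder grows a single internal buffer instead.
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 1000; i++)
        {
            builder.Append("line ").Append(i).AppendLine();
        }
        string fast = builder.ToString();

        Console.WriteLine("{0} / {1} characters", slow.Length, fast.Length);
    }
}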

Recommendation 2 – Stored Procedures

Some applications also build SQL statements by formatting or concatenating strings. Besides paying the penalty of not having compiled, optimized SQL statements, they are also adding pressure to the GC because they create string objects all the time. Whenever possible, use stored procedures.
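A sketch of the parameterized alternative; the GetProductCount procedure and its parameter are hypothetical:

using System;
using System.Data;
using System.Data.SqlClient;

class StoredProcDemo
{
    static void Main()
    {
        using (SqlConnection connection = new SqlConnection(
            "Data Source=.;Initial Catalog=AdventureWorks2008;Integrated Security=True"))
        // GetProductCount is a hypothetical stored procedure
        using (SqlCommand command = new SqlCommand("GetProductCount", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@Category", "Bikes"); // hypothetical parameter

            connection.Open();

            // No SQL strings are built per call, and SQL Server reuses the plan.
            int count = (int)command.ExecuteScalar();
            Console.WriteLine("Products: {0}", count);
        }
    }
}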

Recommendation 3 – Caching has side effects

At first impression caching is always good. Which is better, to always compute a new page or to return a page that is already there? Most people would answer caching. Well, the correct answer is a little more complicated. In principle caching is good, but one must balance caching against memory efficiency. If the memory of a server gets too low, it will have problems creating replies even if most of the stuff that has to be rendered is in the cache. Each page that has to be output needs some amount of memory to be served, and if that memory is low then the GC has to clean things up and the server will serve fewer requests.

So when it comes to caching, I would say that it should be configurable, CacheItemPriority.NotRemovable should be reserved for special cases, and most of the time we should let the caching algorithms work out what should be taken out of memory. Also, an application that creates a lot of memory pressure should use the priority levels correctly and not place everything at the same priority. Structures that are shared by everyone and have a higher probability of being reused should get a higher priority level.
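A sketch of what that looks like with the ASP.NET cache; the keys, durations and priorities are illustrative:

using System;
using System.Web;
using System.Web.Caching;

public static class CacheHelper
{
    public static void CacheRenderedPage(string key, string html)
    {
        // Ordinary page output: let the cache evict it under memory pressure.
        HttpRuntime.Cache.Insert(
            key, html,
            null,                           // no dependency
            DateTime.Now.AddMinutes(5),     // short absolute expiration
            Cache.NoSlidingExpiration,
            CacheItemPriority.Normal,
            null);
    }

    public static void CacheSharedLookup(string key, object lookupTable)
    {
        // Widely shared, expensive-to-build data: higher priority, but still
        // removable - NotRemovable stays reserved for the special cases.
        HttpRuntime.Cache.Insert(
            key, lookupTable,
            null,
            DateTime.Now.AddHours(1),
            Cache.NoSlidingExpiration,
            CacheItemPriority.High,
            null);
    }
}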

Recommendation 4 – Session and Context

Web applications are very special in terms of session and context. When you are designing something that is hosted on the client and uses its resources, you have the memory of that client to store your stuff. But at the server, the same amount of memory has to be shared by the 10,000 users of the server.

The first bad idea is trying to give users the same experience on the web as they have on the desktop. Web applications should be as contextless and sessionless as possible. The worst thing you can do to your customers is add memory pressure to the server. There are alternatives to storing things in memory, one of them being storing them in a data server. That same data can then be cached. It is true that it still uses memory, but it is memory that can be reclaimed if needed.
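A rough sketch of that pattern: read through the cache and fall back to the data server, instead of pinning the data in Session (Basket and LoadBasketFromDatabase are hypothetical):

using System;
using System.Web;
using System.Web.Caching;

public class Basket { /* domain object, details omitted */ }

public class BasketRepository
{
    // Instead of Session["basket"], read through the cache; under memory
    // pressure the entry can be evicted and rebuilt from the database.
    public Basket GetBasket(string userId)
    {
        string key = "basket:" + userId;
        Basket basket = HttpRuntime.Cache[key] as Basket;
        if (basket == null)
        {
            basket = LoadBasketFromDatabase(userId);   // hypothetical data access
            HttpRuntime.Cache.Insert(
                key, basket, null,
                Cache.NoAbsoluteExpiration,
                TimeSpan.FromMinutes(20),              // sliding window, like a session
                CacheItemPriority.Low,                 // fine to evict
                null);
        }
        return basket;
    }

    private Basket LoadBasketFromDatabase(string userId)
    {
        throw new NotImplementedException("hypothetical data access");
    }
}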

Recommendation 5 – Study how the users interact with the application and build on an architecture that can be tuned.

Most decisions here have a lot to do with the way users interact with the application. With this knowledge one can choose stress tests and load tests that are closer to the way users push the application. This will naturally lead to better tuning.

Conclusion

When it comes to performance, the only process that leads to good results is performance and stress testing. These tests have to be done from day one because they can lead to architecture changes. This is not the case in most development processes: most teams tend to run these tests when the product is already at an advanced stage and changes are hard to make.

It is important to note that testing and tuning are different things. Testing is about making sure the product is still within acceptable behavior. I find that optimization works out best at a later stage. One should only go and change the architecture and code if testing shows the behavior is below the requirements.

Monday, August 3, 2009

Expression Blend 3 and Vmware Player compatibility problem

I had a problem installing Blend 3 on VMware Player 2.5.2 build-156735. Most of the controls on the setup window did not appear. After some googling I found that the problem was in the graphics driver. It seems the driver has problems when combining 3D and 2D content in DirectX.

As suggested, the workaround is disabling the 3D support. I did this using the dxdiag tool, which can be found in the system32 folder. Just type dxdiag (if you have DirectX 9) in the Run box and it should take you there.

disable 3d acceleration

You only need to disable it to run the setup. You can enable it again afterwards. If you find any problems when using it, especially when mixing 3D content, just repeat the process.

Have fun,

Thursday, July 30, 2009

ADO.NET Data Services – 1.5 CTP1

 

I finally got a chance to try out this new preview. Here is a snippet that exposes a simplified customer list as an Atom feed. This was really quick. The only thing I had to do was update Internet Explorer, because I was using an XP virtual machine :).

using System;
using System.Collections.Generic;
using System.Data.Services;
using System.Linq;

namespace DemoServerSite
{
    public class Customer
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public string Code { get; set; }
        public string Street { get; set; }
        public string Country { get; set; }
        public string PostalCode { get; set; }
        public string FiscalId { get; set; }
    }

    public class CustomersDataContext
    {
        private List<Customer> source;

        public CustomersDataContext()
        {
            source = new List<Customer>();
            source.Add(new Customer { ID = 0, Name = "Jose", Code = "C0001" });
            source.Add(new Customer { ID = 1, Name = "Maria", Code = "C0002" });
        }

        public IQueryable<Customer> Customers
        {
            get
            {
                return this.source.AsQueryable();
            }
        }
    }

    public class WebDataService1 : DataService<CustomersDataContext>
    {
        // This method is called only once to initialize service-wide policies.
        public static void InitializeService(IDataServiceConfiguration2 config)
        {
            // TODO: set rules to indicate which entity sets and service operations are visible, updatable, etc.
            // Examples:
            config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
            config.SetServiceOperationAccessRule("*", ServiceOperationRights.All);
        }
    }
}

This new technology seems promising because it is really lightweight. It integrates nicely with the Entity Framework, and we can tune the behavior using interceptors; a rough sketch of what that looks like is below. I’ll try to publish fuller examples of that soon.
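A query interceptor is just an attributed method on the service class that returns a filter expression which the runtime merges into every query over that set. A minimal sketch against the service above (the Country rule is made up):

using System;
using System.Data.Services;
using System.Linq.Expressions;

public class WebDataService1 : DataService<CustomersDataContext>
{
    // Runs for every query over the Customers set; the returned predicate
    // is merged into the query. The filtering rule here is made up.
    [QueryInterceptor("Customers")]
    public Expression<Func<Customer, bool>> OnQueryCustomers()
    {
        return c => c.Country != "Hidden";
    }
}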

Have fun,

Wednesday, July 29, 2009

Building WCF Custom Channels – A channel project template

 

I am in the process of learning in-depth how to write channels for WCF. One of the hard things was getting an idea of the parts we have to build to achieve this goal.

In this process I built this simple template that contains the major parts and a getting-started client/server example to debug it and get it running. If you hit F5 you will get a NotImplementedException, and my process is to implement each method as it gets called and throws. This seems like a good way of learning how the stack is organized and in what sequence things happen.
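To give an idea of the shape of those parts, this is roughly the kind of skeleton involved, reduced here to a transport binding element and its channel factory (the names are mine, not the template’s):

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// The binding element is the entry point: WCF asks it to build channel factories.
public class MyTransportBindingElement : TransportBindingElement
{
    public override string Scheme
    {
        get { return "my.transport"; }
    }

    public override BindingElement Clone()
    {
        return new MyTransportBindingElement();
    }

    public override bool CanBuildChannelFactory<TChannel>(BindingContext context)
    {
        return typeof(TChannel) == typeof(IRequestChannel);
    }

    public override IChannelFactory<TChannel> BuildChannelFactory<TChannel>(BindingContext context)
    {
        return (IChannelFactory<TChannel>)(object)new MyRequestChannelFactory();
    }
}

// Every member starts out throwing; implement each one as the stack calls it.
public class MyRequestChannelFactory : ChannelFactoryBase<IRequestChannel>
{
    protected override IRequestChannel OnCreateChannel(EndpointAddress address, Uri via)
    {
        throw new NotImplementedException();
    }

    protected override void OnOpen(TimeSpan timeout)
    {
        throw new NotImplementedException();
    }

    protected override IAsyncResult OnBeginOpen(TimeSpan timeout, AsyncCallback callback, object state)
    {
        throw new NotImplementedException();
    }

    protected override void OnEndOpen(IAsyncResult result)
    {
        throw new NotImplementedException();
    }
}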

This was built by stripping all the code out of the NullTransport project here by Roman Kiss, so all credit goes to him :).

Download it here.

Have fun,