Software engineering blog of Clément Bouillier: 2009

Thursday, July 2, 2009

Expression Trees and Reflection performances

Context

Lots of people think Reflection is evil for performance, but you can tune your code to avoid some of the problems and, even better, write code as quick as "good old-fashioned C# code" (I mean code written without Reflection).

Beware, I am not justifying the use of Reflection in every application to reach the "holy grail of generic code". I think Reflection can be used in some frameworks, but application code must be strongly typed for readability and maintainability. Object/Relational Mapping is an example of such a framework, and that is why I would like to talk here about Reflection and performance: the ORM can be a performance bottleneck in an enterprise application.

I worked on a project that uses a homemade ORM, which uses Reflection to create object instances (entities) and to hydrate them. Working on performance with a memory profiler, I saw that a lot of time was spent in the Type.GetProperty and PropertyInfo.SetValue methods (both in the same proportion). So I had a look at the code and saw that, first, I could cache the PropertyInfo retrieved by Type.GetProperty, which gave me a 50% performance win relative to the overhead spent in Reflection.
That is a first lesson on Reflection performance: cache the Reflection objects related to the types you use.

Then I thought about the old days, when we only had Reflection.Emit to improve this piece of code further: with it you can reduce the overhead to zero. I let you find examples on the web or look at the implementation in NHibernate, which uses it, because it is a little bit painful: it is IL generation, i.e. an "improved assembler" more or less ;). Second lesson: the time spent in Reflection can be reduced to the time taken to set up your application, with zero overhead due to Reflection after that.
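To give an idea of what the Reflection.Emit route looks like, here is a minimal sketch using DynamicMethod. It is not the NHibernate implementation, just an illustration: SimpleObjectA is reduced to a single Name property (its real definition is not shown in this post), EmitSetterFactory is a name I made up, and the target type is assumed to be a class, not a struct.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Assumption: SimpleObjectA is reduced to one Name property for this sketch.
public class SimpleObjectA
{
    public string Name { get; set; }
}

// Hypothetical helper: generates the IL of a setter call once, at setup time.
public static class EmitSetterFactory
{
    public static Action<object, object> Create(Type type, string propertyName)
    {
        PropertyInfo property = type.GetProperty(propertyName);
        if (property == null || property.GetSetMethod() == null)
            throw new ArgumentException("No public setter for " + propertyName);
        MethodInfo setter = property.GetSetMethod();

        // void Set_X(object target, object value), attached to the module of the type.
        DynamicMethod method = new DynamicMethod(
            "Set_" + propertyName,
            typeof(void),
            new Type[] { typeof(object), typeof(object) },
            type.Module,
            true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                           // load target
        il.Emit(OpCodes.Castclass, type);                   // (T)target, reference types only
        il.Emit(OpCodes.Ldarg_1);                           // load value
        il.Emit(OpCodes.Unbox_Any, property.PropertyType);  // cast or unbox the value
        il.Emit(OpCodes.Callvirt, setter);                  // target.Property = value
        il.Emit(OpCodes.Ret);

        return (Action<object, object>)method.CreateDelegate(typeof(Action<object, object>));
    }
}
```

The delegate is built once and then reused, e.g. `Action<object, object> set = EmitSetterFactory.Create(typeof(SimpleObjectA), "Name");` followed by `set(o, "plip");` inside the loop.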

Finally, I had a look at the new features of C# 3.0 and the .NET 3.5 API around LINQ and lambda expressions, which freed me from using Reflection.Emit. The main class is Expression, found in System.Linq.Expressions. I am not sure, but it is probably less powerful than Reflection.Emit (I am thinking of complex method generation...).

Now let's go into code with a simple case: setting a property on one million objects.

First try (lesson one learned)
Here is some simple code: first with GetProperty and SetValue inside the iteration, and second with GetProperty outside the iteration:

    // Set using GetProperty + SetValue
    init = DateTime.Now;
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        prop = o.GetType().GetProperty("Name");
        prop.SetValue(o, "plip", null);
    }
    end = DateTime.Now;
    Console.WriteLine("Set using GetProperty + SetValue : {0} + {1}", start.Subtract(init), end.Subtract(start));

    // Set using GetProperty only once + SetValue
    init = DateTime.Now;
    prop = typeof(SimpleObjectA).GetProperty("Name");
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        prop.SetValue(o, "plip", null);
    }
    end = DateTime.Now;
    Console.WriteLine("Set using GetProperty only once + SetValue : {0} + {1}", start.Subtract(init), end.Subtract(start));
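In an ORM the types and property names are only known at runtime, so "GetProperty only once" usually becomes a small cache. Here is a minimal sketch of that idea; PropertyCache is a hypothetical name, SimpleObjectA is reduced to a single Name property, and the simple lock is an assumption about how you want to handle threads.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Assumption: SimpleObjectA is reduced to one Name property for this sketch.
public class SimpleObjectA
{
    public string Name { get; set; }
}

// Hypothetical cache: Type.GetProperty is called once per (type, property name).
public static class PropertyCache
{
    private static readonly Dictionary<Type, Dictionary<string, PropertyInfo>> Cache =
        new Dictionary<Type, Dictionary<string, PropertyInfo>>();
    private static readonly object Sync = new object();

    public static PropertyInfo Get(Type type, string propertyName)
    {
        lock (Sync)
        {
            Dictionary<string, PropertyInfo> properties;
            if (!Cache.TryGetValue(type, out properties))
            {
                properties = new Dictionary<string, PropertyInfo>();
                Cache[type] = properties;
            }

            PropertyInfo property;
            if (!properties.TryGetValue(propertyName, out property))
            {
                // The only place where the costly lookup happens.
                property = type.GetProperty(propertyName);
                properties[propertyName] = property;
            }
            return property;
        }
    }
}
```

Inside the loop, `PropertyCache.Get(o.GetType(), "Name").SetValue(o, "plip", null);` then pays for a dictionary lookup instead of a Reflection lookup. SetValue itself is still slow, which is what the next section deals with.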

Second try
Then I tried to do it with Linq.Expressions, and it gave me some interesting thoughts about how to prepare the setter at initialization.
First, I thought of creating a strongly typed Expression of type Action<T, PT>, where T is the type of the object that has a property to set and PT is the type of that property. In my example it is very simple, but in an ORM context, for example, it could be a little bit more tricky. Here is the code:

    // Set using typed LambdaExpression (Action<T,I>)
    init = DateTime.Now;
    param = Expression.Parameter(typeof(SimpleObjectA), "x");
    value = Expression.Parameter(typeof(string), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    Expression<Action<SimpleObjectA, string>> typedExpression =
        Expression.Lambda<Action<SimpleObjectA, string>>(Expression.Call(param, setter, value), param, value);
    Action<SimpleObjectA, string> set = typedExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        set(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using typed LambdaExpression (Action<T,I>) : {0} + {1}", start.Subtract(init), end.Subtract(start));

In fact, when doing mapping, you do not know the T type at compile time, so I thought of using an untyped lambda (a plain Delegate, in fact). But then you have to call the Delegate through DynamicInvoke, which is far worse than SetValue, so I searched for another solution. For reference, here is the code:

    // Set using LambdaExpression + DynamicInvoke
    init = DateTime.Now;
    param = Expression.Parameter(typeof(SimpleObjectA), "x");
    value = Expression.Parameter(typeof(string), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    LambdaExpression dynamicExpression = Expression.Lambda(Expression.Call(param, setter, value), param, value);
    Delegate dynamic = dynamicExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        dynamic.DynamicInvoke(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using LambdaExpression + DynamicInvoke : {0} + {1}", start.Subtract(init), end.Subtract(start));

So I found this article from Nate Kohari on Late Bound Invocation with Expression Trees. The idea is to use an Action<object, object> and to convert to the real types inside the expression tree (the compiled delegate can be cached). Now performance is equivalent to using the property setter directly. Here is the code:

    // Set using typed LambdaExpression (Action<object,object>) -> late bound
    init = DateTime.Now;
    param = Expression.Parameter(typeof(object), "x");
    value = Expression.Parameter(typeof(object), "y");
    setter = typeof(SimpleObjectA).GetProperty("Name").GetSetMethod();
    Expression<Action<object, object>> lateBoundTypedExpression =
        Expression.Lambda<Action<object, object>>(
            Expression.Call(
                Expression.Convert(param, setter.DeclaringType),
                setter,
                Expression.Convert(value, setter.GetParameters()[0].ParameterType)),
            param, value);
    Action<object, object> lateBoundSet = lateBoundTypedExpression.Compile();
    start = DateTime.Now;
    foreach (SimpleObjectA o in objects)
    {
        lateBoundSet(o, "plip");
    }
    end = DateTime.Now;
    Console.WriteLine("Set using typed LambdaExpression (Action<object,object>) -> late bound : {0} + {1}", start.Subtract(init), end.Subtract(start));
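In a mapping context you would build one such delegate per mapped property and keep it. Here is a minimal sketch of how the late-bound setter could be wrapped and cached; LateBoundSetters is a hypothetical name, SimpleObjectA is reduced to a single Name property, and the sketch assumes the declaring type is a class (with a struct, the Convert would unbox a copy and the set would be lost).

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Assumption: SimpleObjectA is reduced to one Name property for this sketch.
public class SimpleObjectA
{
    public string Name { get; set; }
}

// Hypothetical helper: one compiled Action<object, object> per property.
public static class LateBoundSetters
{
    private static readonly Dictionary<PropertyInfo, Action<object, object>> Cache =
        new Dictionary<PropertyInfo, Action<object, object>>();
    private static readonly object Sync = new object();

    public static Action<object, object> For(PropertyInfo property)
    {
        lock (Sync)
        {
            Action<object, object> set;
            if (!Cache.TryGetValue(property, out set))
            {
                set = Build(property); // Compile() is paid only once per property
                Cache[property] = set;
            }
            return set;
        }
    }

    private static Action<object, object> Build(PropertyInfo property)
    {
        MethodInfo setter = property.GetSetMethod();
        ParameterExpression target = Expression.Parameter(typeof(object), "x");
        ParameterExpression value = Expression.Parameter(typeof(object), "y");

        // ((T)x).set_Property((PT)y)
        Expression call = Expression.Call(
            Expression.Convert(target, property.DeclaringType),
            setter,
            Expression.Convert(value, property.PropertyType));

        return Expression.Lambda<Action<object, object>>(call, target, value).Compile();
    }

    // Small usage example: sets Name through the cached delegate.
    public static string Run()
    {
        SimpleObjectA o = new SimpleObjectA();
        Action<object, object> set = For(typeof(SimpleObjectA).GetProperty("Name"));
        set(o, "plip");
        return o.Name;
    }
}
```

The cost of building and compiling the expression tree then belongs to the setup of the application, as in the second lesson above.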

In conclusion, here are the times (in ms) I got on my laptop:

Strategy	Initialization time (ms)	One million iterations time (ms)
GetProperty + SetValue inside iteration	0	2891
GetProperty once & SetValue inside iteration	0	1953
Delegate + DynamicInvoke	187	6234
Late-bound Lambda Expressions	0	31
Strongly typed Lambda Expressions	0	31
Property setter directly	0	31

I tried with 10 million iterations, and I got approximately 10x the times of the previous table.
With 60 million iterations, I tried only the three best-performing solutions, and we start to see small differences between them (I could have tried a realistic situation with several properties bound, but in fact that would be 10 million iterations with 6 properties, and it should give the same results):

Strategy	Initialization time (ms)	60 million iterations time (ms)
Late-bound Lambda Expressions	0	2219
Strongly typed Lambda Expressions	0	1797
Property setter directly	0	1578
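A note on measurement: the timings above rely on DateTime.Now, whose resolution is coarse (typically in the 10-15 ms range on Windows, as far as I know), so values like 31 ms are close to the measurement granularity. If you want to redo these measurements, a Stopwatch-based helper is one option. Here is a minimal sketch; Timing and TimingDemo are hypothetical names and the loop in Run is only a placeholder workload.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: times a whole loop with Stopwatch instead of DateTime.Now.
public static class Timing
{
    public static TimeSpan Measure(Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        body(); // the caller passes the full loop, so no per-iteration delegate cost is added
        watch.Stop();
        return watch.Elapsed;
    }
}

public static class TimingDemo
{
    // Placeholder workload: fills an array of one million integers.
    public static TimeSpan Run()
    {
        int[] data = new int[1000000];
        return Timing.Measure(delegate
        {
            for (int i = 0; i < data.Length; i++)
            {
                data[i] = i;
            }
        });
    }
}
```

Each strategy of this post can be passed as the body, for example a delegate containing the whole foreach loop over the objects.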

Thursday, April 9, 2009

Applying IoC/DI in application architecture

IoC and DI are two more or less equivalent patterns; I let Martin Fowler explain the details. IoC stands for Inversion of Control and DI for Dependency Injection.
We will see in this post how to use this pattern effectively inside an application architecture, and what the benefits of using it are.

Application Architecture
There are a lot of architecture styles, but here we will focus on a common approach: the layered architecture, generally with 3 main layers (which could themselves be split into other layers, but we do not talk about that in this post...).
There are several approaches:

in the "classical" approach (Microsoft UI/BLL/DAL or Java Struts/Bean/DAO), we often talk about Presentation (or UI)/Business/Data
in DDD (Domain Driven Design), if I simplify, we have UI/Application Core/Infrastructure (I take the words of Jeffrey Palermo in his "Onion architecture")

The "Onion architecture" as described by Jeffrey Palermo certainly leads you to implement some good practices such as IoC and DI, but it is also possible to do so in the "classical" approach. I will come back to one of my first projects (starting at the end of 2005), where I had to lead the application architecture (it was in .NET/C#, but everything in this post could apply to any OO language).

A little project story
When I started, I was aware of some good practices (thanks to the DotNetGuru.org French community) like interfacing your classes, splitting into layers, using POCO, persistence ignorance, applying patterns (mainly GoF patterns at that time)... but I was not aware of the "Onion Architecture" approach then, so I tried to implement a "classical" architecture (for the background, we did not use TDD).
This architecture was a little bit heavy to work with: for each new Web page, we needed to create 3 interfaces and their 3 implementations, and add the instantiation in some homemade Abstract Factory... and what we can notice is that we did not realize exactly why we were coding this way, a little bit like robots in fact, just because "it was the good practice".
And now I am working again on this project and nothing has changed, and it is great. Why? Because I have learned some things I had not heard about 3 years ago, and I think they legitimate this architecture (and the Onion one even more, in fact). That is what we will see next, but first let's have a quick overview of IoC/DI.

Overview of IoC/DI
I already gave a reference to Martin Fowler's post on the subject, but I would like to draw an overview of IoC/DI here for the purpose of this post. I will take an example with interfaces, which is the best example of IoC/DI use (even if you could imagine other use cases).
Typically, you have a class Client that relies on an interface IMyInterfaceA, which has an implementation MyImplA, which itself relies on IMyInterfaceB, which has an implementation MyImplB. To link all these classes together, I have either:

to call the MyImplA constructor in Client and the MyImplB constructor in MyImplA, which creates a dependency between all the implementations, losing the benefits of interfacing (even if it is possible to enforce "localized" instantiation... but it needs careful control of the code, which could be more ...)
to implement the AbstractFactory pattern (I do not detail this point...)
or to use an IoC/DI framework

The IoC/DI framework will use the configuration file or configuration code to inject MyImplA wherever IMyInterfaceA is needed, and so on. Another thing to note is that it injects dependencies recursively: when I request an IMyInterfaceA object in the Client class, it injects MyImplA as IMyInterfaceA, and MyImplB as the IMyInterfaceB on which the MyImplA class relies. I invite you to look at the details of the different injection mechanisms: constructor injection, setter injection, method injection, interface injection...
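To make the recursive injection concrete, here is a minimal sketch of constructor injection with a toy container. TinyContainer is a hypothetical class written for this post, not the API of any real IoC/DI framework, and the Describe/Run methods are invented only to show the chain; real frameworks add lifetimes, configuration files and much more.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface IMyInterfaceA { string Describe(); }
public interface IMyInterfaceB { string Describe(); }

public class MyImplB : IMyInterfaceB
{
    public string Describe() { return "MyImplB"; }
}

public class MyImplA : IMyInterfaceA
{
    private readonly IMyInterfaceB _b;
    public MyImplA(IMyInterfaceB b) { _b = b; } // constructor injection
    public string Describe() { return "MyImplA -> " + _b.Describe(); }
}

public class Client
{
    private readonly IMyInterfaceA _a;
    public Client(IMyInterfaceA a) { _a = a; }
    public string Run() { return "Client -> " + _a.Describe(); }
}

// Hypothetical toy container: maps interfaces to implementations ("configuration code")
// and resolves constructor parameters recursively.
public class TinyContainer
{
    private readonly Dictionary<Type, Type> _map = new Dictionary<Type, Type>();

    public void Register<TInterface, TImpl>() where TImpl : TInterface
    {
        _map[typeof(TInterface)] = typeof(TImpl);
    }

    public T Resolve<T>() { return (T)Resolve(typeof(T)); }

    private object Resolve(Type requested)
    {
        Type concrete;
        if (!_map.TryGetValue(requested, out concrete)) concrete = requested;

        ConstructorInfo[] constructors = concrete.GetConstructors();
        if (concrete.IsAbstract || constructors.Length == 0)
            throw new InvalidOperationException("Cannot create " + concrete.Name);

        // Take the first public constructor and resolve each of its parameters.
        ParameterInfo[] parameters = constructors[0].GetParameters();
        object[] args = new object[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
        {
            args[i] = Resolve(parameters[i].ParameterType);
        }
        return constructors[0].Invoke(args);
    }
}

public static class InjectionDemo
{
    public static string Run()
    {
        TinyContainer container = new TinyContainer();
        container.Register<IMyInterfaceA, MyImplA>();
        container.Register<IMyInterfaceB, MyImplB>();
        // No "new MyImplA(new MyImplB())" anywhere in Client code.
        return container.Resolve<Client>().Run(); // "Client -> MyImplA -> MyImplB"
    }
}
```

Client only knows IMyInterfaceA, MyImplA only knows IMyInterfaceB, and the mapping lives in one place.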
So now, I think you should start to understand where I would like to go...

Applying IoC/DI
To sum up, we have (or we need, if starting):

a layered application architecture using interfaces to separate layers (or even sub-layers)
one of the numerous IoC/DI frameworks (see here for .NET, Spring for Java...)

We are nearly done: we just have to use an IoC/DI framework in our application. We could certainly create a dependency on the framework everywhere we need it, but I would prefer creating a simple class that is the only one relying on the framework.
Then, for example (I do not assert that it is the best architecture), if we consider the following layers:

UI
Business
- Business Services which take care of business processes = 2 projects, one for interfaces, one for implementation
- Business Entities which bring business rules = 2 projects, one for interfaces, one for implementation
Data Access Services which manage data access and manipulate Business Entities = 2 projects, one for interfaces, one for implementation

We can have the following hard dependencies:

We can add some dependencies, for example from BusinessEntityImpl to IBusinessService or from AnotherBusinessService to IBusinessService, but then a central point is that we have to watch out for the circular dependencies that could be created (for example if BusinessService depends on IAnotherBusinessService).
Note that, in the end, you no longer need any new statement!

OK! But why use this in the end?
Yes, you are right to ask this question, but I have some answers besides the "ivory tower architect" answer: "because it is the right way" ;).
Since instantiation is delegated to the IoC/DI framework, you can then imagine a lot of things, but the one I prefer looks like AOP (Aspect Oriented Programming). It is not AOP by combining byte code, i.e. the byte code of generic cross-cutting concerns and your business byte code; it is done at runtime (yes, it could then lead to performance issues, but only in special cases...).
Here is an example in .NET with the EntLib Policy Injection AB; it is really simple to add AOP if you already have IoC/DI:

In your Factory class, which is the only one dependent on the IoC/DI framework, rather than just calling the IoC/DI framework, call the Wrap method of the EntLib PolicyInjection class (see David Hayden for more details)
Add to the EntLib PolicyInjection AB configuration all the cross-cutting concerns you want to apply only to a "defined perimeter" (via policies and matching rules, see MSDN for more details).
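The two steps above can be sketched without any framework. In this sketch a hand-written logging decorator stands in for what the Policy Injection block generates at runtime, and a plain constructor call stands in for the IoC/DI framework, so none of this is the EntLib API; IBusinessService comes from the layers above, while its Process method, Factory and LoggingBusinessService are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public interface IBusinessService
{
    string Process(string input);
}

public class BusinessService : IBusinessService
{
    public string Process(string input) { return input.ToUpperInvariant(); }
}

// Stand-in for a runtime-generated interception proxy: same interface,
// cross-cutting concern (here, logging) around the real call.
public class LoggingBusinessService : IBusinessService
{
    private readonly IBusinessService _inner;
    private readonly List<string> _log;

    public LoggingBusinessService(IBusinessService inner, List<string> log)
    {
        _inner = inner;
        _log = log;
    }

    public string Process(string input)
    {
        _log.Add("before Process");
        string result = _inner.Process(input);
        _log.Add("after Process");
        return result;
    }
}

// The only class that would depend on the IoC/DI framework and on the wrapping mechanism.
public static class Factory
{
    public static readonly List<string> Log = new List<string>();

    public static IBusinessService GetBusinessService()
    {
        // Assumption: a real application would ask its IoC/DI framework here.
        IBusinessService service = new BusinessService();
        // Assumption: a real application would wrap through its interception mechanism here.
        return new LoggingBusinessService(service, Log);
    }
}
```

Callers keep using `Factory.GetBusinessService().Process("abc")` and never see whether the returned object is wrapped or not, which is why adding the cross-cutting concern touches only the Factory.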

Oh great ! How can I apply it to my project ?
If you are starting a new project, consider all the elements given in this post and be sure to understand what you are doing, otherwise it could lead to a disaster...
If you have an existing application, I see 2 cases:

either you are in a situation similar to the one I described (using interfaces, layers and other good practices...), and then it would be relatively easy to change your application step by step to reduce the technical debt (if you consider it as such).
or you are far from this situation: then consider a technical refactoring or, if that is not possible, bear it, cry or leave, it depends ;)... but note that a technical refactoring can also be done step by step (each step just costs more than in the first case).

Hum...and what about performance ?
I think performance should not be an up-front concern. Certainly, you should try to anticipate during development the performance issues you could encounter, to avoid starting again if they show up, but we should never say "I will never use this or that because it could lead to performance issues": it has to be studied case by case.
So here is some advice for studying the performance impact IoC/DI could have:

it is the "dynamic" instantiation that costs more than a new (it does not necessarily use reflection, since configuration can be done in code or configuration files can be loaded once at start), so consider your instantiation strategies
consider using the singleton pattern: it avoids repeated instantiation when your class does not hold any state
consider IoC/DI just for some of your objects, the relevant ones, but do not ban IoC/DI because one of your objects does not allow its use

If I apply this to my last example, I could for example say that I would use IoC/DI for the Business Services and Data Access Services classes with a singleton strategy, and not use it for Business Entities (but then notice that you will need dependencies on the Business Entities implementation from the Business and Data Access Services...)

Friday, March 27, 2009

TDD meeting at Alt.NET France

On Wednesday, March 25th, we had a meeting on TDD at Alt.NET France. It was led by two Octo consultants, Frédéric Schäfer and Djamel Zouaoui. Again, we reached a new attendance record, around 30 people.

You can find more detailed minutes, which I wrote in French, on the Alt.NET France site. Here is a quick synthesis of the conclusions I drew from this meeting.

The presentation started with an overview of why to test, or why not. The discussion raised a lot of stereotypes: "longer", "more expensive", "I'll write them later" (=never ;)) and so on...

Then we got to when testing is useful (and indeed, more or less, why):

get code that just reveals developer intention (not more),
test at the right granularity,
protect yourself against inevitable application changes (regression),
avoid losing time in repetitive debugging and in throwaway test console applications that are not reusable.

Frédéric and Djamel presented some code samples around MasterMind game to underline how to apply TDD. Main points were :

write the test first, otherwise you never will...
write a simple test first, and increase complexity step by step,
don't try to conceptualize/develop the perfect thing on the first try (let iterations drive you to the most accurate design),
use some UML diagrams to guide your global intention but do not lose yourself in details,
use the following virtuous circle: intention > test code (does not compile yet) > just enough code to compile the test, which fails (red) > code that passes the test (green) > refactor (your code or your tests, not both in the same iteration)
write tests correctly = AAA: Arrange, Act, Assert (be careful of tests without assertions)
a reported bug needs a test to make sure it is fixed
always run all the tests (which should be automatic and fast)
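To illustrate the AAA layout on the MasterMind theme, here is a minimal sketch. It is not the code shown during the meeting: Score, MasterMind.Evaluate and the colour letters are hypothetical, and the test is written as a plain method that throws, without assuming any particular test framework.

```csharp
using System;
using System.Collections.Generic;

public struct Score
{
    public int WellPlaced; // right colour, right position
    public int Misplaced;  // right colour, wrong position
}

public static class MasterMind
{
    public static Score Evaluate(string secret, string guess)
    {
        if (secret.Length != guess.Length)
            throw new ArgumentException("secret and guess must have the same length");

        Score score = new Score();
        Dictionary<char, int> unmatchedSecret = new Dictionary<char, int>();
        List<char> unmatchedGuess = new List<char>();

        // First pass: exact matches; everything else is kept for the second pass.
        for (int i = 0; i < secret.Length; i++)
        {
            if (secret[i] == guess[i]) { score.WellPlaced++; continue; }
            int count;
            unmatchedSecret.TryGetValue(secret[i], out count);
            unmatchedSecret[secret[i]] = count + 1;
            unmatchedGuess.Add(guess[i]);
        }

        // Second pass: each remaining guess peg consumes at most one remaining secret peg.
        foreach (char peg in unmatchedGuess)
        {
            int remaining;
            if (unmatchedSecret.TryGetValue(peg, out remaining) && remaining > 0)
            {
                score.Misplaced++;
                unmatchedSecret[peg] = remaining - 1;
            }
        }
        return score;
    }
}

public static class MasterMindTests
{
    public static void Evaluate_CountsWellPlacedAndMisplacedPegs()
    {
        // Arrange
        string secret = "RGBY";
        string guess = "RBGO";

        // Act
        Score score = MasterMind.Evaluate(secret, guess);

        // Assert
        if (score.WellPlaced != 1) throw new Exception("expected 1 well-placed peg");
        if (score.Misplaced != 2) throw new Exception("expected 2 misplaced pegs");
    }
}
```

In a TDD iteration the test method would be written first (red, since Evaluate does not exist yet), then Evaluate would be grown just enough to make it pass.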

Finally, we got to deeper subjects, and I concluded that TDD is a practice that leads developers to a set of well-known, good design practices:

Apply patterns: dependency injection, GoF, M-V-VM for WPF and DDD were some of the examples,
Use mocks/stubs to isolate tested objects,
Continuous integration: to build and run tests continuously (and much more...),
Code coverage: beware of the 100% coverage illusion, it does not mean that your code is perfect, because of the problem of input data combinations.

I was already convinced by TDD, but I think I have a more precise vision of it now. To conclude, I think TDD brings:

code documentation (executable specifications),
protection against regressions and during refactoring,
accurate and flexible design for your code.

Thanks again to Octo, Frédéric and Djamel for hosting and organizing this meeting.

Thursday, March 5, 2009

ALT.NET France is more and more active

I have been pleased to participate in ALT.NET France since August 2008. It began a little earlier, at the instigation of some people who continue to spend time on this great technical community: Robert Pickering, Julien Lavigne du Cadet, Gauthier Segay, Romain Verdier and more (do not take offense if I forget some people :o))...

We started with a Google group list and some informal meetings once a month in a bar, and since November we have been meeting around a defined subject presented by one of the community members. You can find some comments about these meetings on several blogs.

We now reach another step: Julien has set up a new web site, which will (or has to and/or should) be updated by the community.

So enjoy and join us: even if you are not fluent in French, you will be able to share with us, as we have several non-French members.