Think#

Wednesday, 22 February 2012

Kill Your Dependency Injection Container

First of all, I am quite fond of the Inversion of Control pattern. I have been a heavy user of various IoC frameworks for years (almost a decade now). Earlier I never really questioned my use of such tools. They got me going pretty fast. But a year ago I read the blog post Simple-Hickey by Uncle Bob and saw the video by Rich Hickey published at InfoQ. This input really got me thinking and reflecting upon what I have experienced in various projects over the years.

What Rich Hickey discusses is known as "simple over easy" or "simple made easy". Hickey’s concern, quoted from Uncle Bob’s blog post, is “that we have a culture of complexity. That when programmers are given a task, they race ahead and write masses of tangled code, using ‘easy’ frameworks and tools, without giving the problem due thought. That we confuse easiness, with simplicity. (e.g. Rails is easy, it is not simple.)”

So back to my own experience. Over the years, have I ever fought with the limitations of a container? With my own limited knowledge of a specific container? Have I ever had to write container-specific integration code to get the container to do exactly what I wanted for a specific odd requirement? Did many or most of my co-workers have sufficient knowledge about these tools? Did a framework “force” me to solve a set of problems in a way I did not find satisfactory? To be honest, both I and several colleagues have struggled with various tools and frameworks over the years. We have spent days researching what is happening in there, and why something goes wrong or does not go according to plan. We have made forks, extended various parts to customize container configuration behaviour, or simply created infrastructural glue to get the framework or the specific tools to do what we needed them to do.

One example that comes to mind: several containers offer support for object lifetime management. This is usually trivial to set up in the configuration and start using. What if you later discover that the default implementation does not play well with your needs, or you need a special kind of lifetime policy that seems impossible to specify within the rules of the framework? You might end up opening that big black box to fix the issues, which are often discovered at customers, in production code. This is infrastructural jamming, which sure can be fun, but is not very productive for the company. If all the developers in the team had a full understanding of what happened in the framework, could easily debug it and had the confidence to change the code, then maybe the problems could have been fixed in a breeze?

IoC frameworks may be easy to use, getting you "up-and-running" in a short amount of time. But the code inside the framework might be quite complex, and is not simple. Alright, all the slogans tell us not to be too concerned about this complexity: these tools have been heavily tested, have a vast user base and have proven themselves worthy over time, so such complexity can be safely abstracted away. And you should not reinvent the wheel, creating a poor, bug-prone implementation of the same functionality. These are all, in many situations, valid arguments. But sometimes you should stop and think about whether you really need that big library or framework. I have worked with applications where the overall code footprint of the IoC framework was much bigger than the application code itself. Truthfully, the complexity in these applications could have been reduced a lot by removing the IoC container, as the configuration needs of such small applications are often limited and trivial. Many of these frameworks are pretty vast, containing code for handling many different concerns. This could be tools for:

Inversion of Control
AOP
Scheduling
Web Framework Integration
Messaging
UI Component Frameworks
Plugin Architecture
Life Time Management
Threading
Service Integration (WCF, COM+, .Net Remoting)
Testing
ORM Integrations
Logging
Synchronization
Events
ESB Integration
Security
Various flavours for configuration
Etc….

Of course, these frameworks are split into modules, so you don’t have to use everything. But from my own experience, once you start using one module, it is easy to drag in another at a later point in time, especially if you take company software policies into consideration. If you only use a small fraction of the functionality offered by a module, then you might consider whether you should use that library at all. Another thing to consider: when using some of the features listed above, programmers might not really take the time to acquire ownership by actively understanding the complexity of a problem. You might hear: ”This will be solved by the framework”! On several occasions, I have seen such abstractions hurt productivity and customer satisfaction at a later point in time. Some of these complexities cannot really be abstracted away. Sooner or later one must understand what is going on and have a strategy for changing or refactoring if needed. The hurting really starts when something goes wrong (bugs or other deficiencies) and programmers have to dig into the implementation details of the framework. So, if you are using a framework or an IoC container, you and your team must be prepared to invest a lot of time and knowledge in this component to avoid future problems when things get complicated. Some complexity cannot be abstracted away, no matter what tools you use.

Currently I’m working on a green-field project, specifying the architecture and the core infrastructure of the application. I’m the only programmer working on the code. In a few months we will scale up and hire more programmers. I want the application to be simple, very simple. I do not want to drag in third party frameworks if I can avoid it. Any third party library adds knowledge and complexity that future co-workers need to handle. This will in turn reduce the selection of programmers I can choose from. So how can I cope with this? Should I reinvent the world? Of course not, I need to find a balance. I can find patterns that solve most problems, and I can time box an experiment implementing a pattern to see if it serves my needs. I need to ask critical questions. For instance, what do I want my IoC container to be able to do? What should be its main responsibility? Should it offer support for service integration, lifetime management, session management, aspect oriented programming and so forth? If I had answered yes to all these questions, then I would have opted for a full-fledged container like Unity, Castle Windsor or Spring.Net. But remember, I want this to be kept simple, so I only want my IoC container to do one thing: assemble object graphs. All other concerns I must solve by other means, not abstracted away and handled by my container.

So with this in mind, I sat down one evening and spent some hours writing my own light-weight implementation of an IoC container. The dependency injection code amounts to about 250 lines of code to get the heavy lifting done. In addition there is a support class of around 130 lines of code facilitating assembly scanning. That is pretty much everything needed to get code configuration similar to what the Spring.Net framework offers to work. Now, my hypothesis is that these roughly 400 lines of in-house code will not cause nearly as much trouble as an IoC framework would. The reasons for my hypothesis are:

My self-rolled Dependency Injector is tiny
It has one well-defined responsibility
It is simple to get an overview of ALL the code
It has 95% unit test coverage
Since it is simple and small with good test coverage, programmers will change the code with confidence
Within a short period of time, programmers will take active ownership of and responsibility for the code
Since the component can only do one thing, all other usual problems must actively be addressed by the programmers themselves. They cannot rely on the Dependency Injector to abstract away and solve their problem.
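To give an idea of how little code "assemble object graphs and nothing else" can take, here is a minimal sketch of constructor injection through reflection. It is not the DependencyInjector from this post: the names TinyInjector, IClock, SystemClock and Greeter are hypothetical, there is no assembly scanning, and every Resolve call builds a fresh graph because lifetime management is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not the DependencyInjector from the post.
public sealed class TinyInjector
{
    // Maps a requested service type to the concrete type that implements it.
    private readonly Dictionary<Type, Type> registrations = new Dictionary<Type, Type>();

    public void Register<TService, TImplementation>() where TImplementation : TService
    {
        registrations[typeof(TService)] = typeof(TImplementation);
    }

    public T Resolve<T>()
    {
        return (T)Resolve(typeof(T), new HashSet<Type>());
    }

    private object Resolve(Type service, HashSet<Type> inProgress)
    {
        Type implementation;
        if (!registrations.TryGetValue(service, out implementation))
        {
            // Unregistered concrete classes are built as-is; abstractions need a mapping.
            if (service.IsAbstract)
                throw new InvalidOperationException("No registration for " + service.FullName);
            implementation = service;
        }

        // Fail fast instead of recursing forever on circular constructor dependencies.
        if (!inProgress.Add(implementation))
            throw new InvalidOperationException("Circular dependency at " + implementation.FullName);

        // Assumption of this sketch: use the public constructor with the most parameters.
        var ctor = implementation.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .FirstOrDefault();
        if (ctor == null)
            throw new InvalidOperationException("No public constructor on " + implementation.FullName);

        // Recursively resolve every constructor argument, then create the instance.
        object[] args = ctor.GetParameters()
            .Select(p => Resolve(p.ParameterType, inProgress))
            .ToArray();
        inProgress.Remove(implementation);
        return ctor.Invoke(args);
    }
}

// Hypothetical types used only to exercise the sketch.
public interface IClock { DateTime Now { get; } }

public sealed class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.UtcNow; } }
}

public sealed class Greeter
{
    private readonly IClock clock;
    public Greeter(IClock clock) { this.clock = clock; }
    public IClock Clock { get { return clock; } }
}
```

After calling Register<IClock, SystemClock>(), a call to Resolve<Greeter>() returns a Greeter whose clock is a SystemClock. Anything beyond that (singletons, scopes, interception) would have to be written explicitly by the team, which is exactly the trade-off argued for above.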

I’m excited to see what the future will bring out of this empirical experiment.

The heading might at first glance sound pretty harsh. That is because I want your attention by provoking all the framework and IoC container lovers. The most important thing to take away from this is to think thoroughly about whether you really need that big IoC framework. Maybe you can be more productive by going really lightweight, choosing an IoC container with a small code footprint. Less code, less complexity...

If you want to play with the code, then download the DependencyInjector here!

Wednesday, 12 January 2011

Move from .Net Remoting to WCF

When .Net 3.0 and Windows Communication Foundation were released I was very, very surprised that Microsoft simply dropped binary remoting support completely in favour of interoperability. I had imagined that there was a space in WCF as well for applications which use the same types on both the server and the client side. There is nothing in the full name of WCF that indicates that it is for service orientation only. In fact, I still think that WCF should embrace different means of serializing messages between two or more endpoints. Hopefully such support will be added later, as I guess .Net Remoting will not be removed for a few more years. Regarding migration from .Net Remoting to WCF, there is a good article here which I read a few years ago:
http://msdn.microsoft.com/en-us/library/aa730857(VS.80).aspx

Some weeks ago I decided to put together a small sample which serializes the same types through .Net Remoting and WCF. I wanted to see how much worse WCF performs compared to .Net Remoting; at least that was my working hypothesis. At first it was only a shallow User object, and then I made the object graph being serialized a bit more complex by adding Groups, Permissions and so forth, to at least get a realistic graph from an OO perspective.

/// <summary>
/// TODO: Document me!
/// </summary>
/// <author>Steinar Dragsnes</author>
[Serializable]
public class User : IEntity
{
    private string firstName;
    private string lastName;
    private string userName;
    private string password;
    private int id;

    public object Payload { get; set; }

    List<Group> groups = new List<Group>();
    .....

/// <summary>
/// TODO: Document me!
/// </summary>
/// <author>Steinar Dragsnes</author>
[Serializable]
public class Group
{
    private int id;
    private string name;

    private User administrator;
    private List<User> members;
    private List<Permission> permissions;
    .....

/// <summary>
/// TODO: Document me!
/// </summary>
/// <author>Steinar Dragsnes</author>
[Serializable]
public class Permission
{
    private int id;
    private string name;
    .....

In terms of size, this graph is still very small, but it was enough to illustrate some shortcomings of the DataContractSerializer's default settings: shared, bi-directional, many-to-many or circular references are not supported by default when creating the XML to be serialized. But this is not a show stopper; some googling turned up this page: http://social.msdn.microsoft.com/Forums/en-US/wcf/thread/f30ecd17-cac0-4cdc-8142-90b5f411936b/
which demonstrates both how to preserve references and how to plug in the NetDataContractSerializer. I moved the code snippets into the sample solution and now it works with WCF, but of course it cannot compare with .Net Remoting in terms of performance (also demonstrated by the sample).
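As a rough illustration of what "preserving references" means, the sketch below round-trips a small circular graph through the DataContractSerializer with PreserveObjectReferences switched on. It is not the code from the forum thread or from the sample: DemoUser, DemoGroup and ReferenceRoundTrip are hypothetical names, and DataContractSerializerSettings assumes .Net 4.5 or later (older versions expose the same switch as a boolean constructor argument).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-ins for the User and Group types of the sample.
[DataContract]
public class DemoUser
{
    [DataMember] public string UserName { get; set; }
    [DataMember] public List<DemoGroup> Groups { get; set; }
}

[DataContract]
public class DemoGroup
{
    [DataMember] public string Name { get; set; }
    // Points back at a user, which makes the graph circular.
    [DataMember] public DemoUser Administrator { get; set; }
}

public static class ReferenceRoundTrip
{
    // Serializes the graph to memory and reads it back as a new graph.
    public static DemoUser Clone(DemoUser user)
    {
        // With the default settings a circular graph cannot be written;
        // this flag makes the serializer emit ids and back-references instead.
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        var serializer = new DataContractSerializer(typeof(DemoUser), settings);

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, user);
            stream.Position = 0;
            return (DemoUser)serializer.ReadObject(stream);
        }
    }
}
```

If a user administers one of its own groups, the copy returned by Clone keeps that shape: copy.Groups[0].Administrator is the very same instance as copy, not a second DemoUser with equal values.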

So based on my little research on WCF, here are some problems with the current migration process from .Net Remoting to WCF if standard conventions are being used:
1. There is no serializer that skips building XML before serializing the object graph, and creating the XML consumes time.
2. When the XML finally has been streamed, then in most cases where one has a real OO model, the stream size is bigger than that of .Net Remoting.
3. There is no LegacyRemotingSerializer that would take care of satisfying all features supported by .Net Remoting out of the box. Interoperability would of course not be possible.
4. References are not automatically preserved, though workarounds exist.
5. Open generics are not supported. This is not a big issue, but they should be supported by a LegacyRemotingSerializer.
6. There should be minimal impact on existing code when migrating from .Net Remoting to WCF.
7. Attribute usage should be avoided. Decorating the service interfaces is OK, and Spring.Net has exporters that do this for you so that your service interfaces can stay POCO.
8. In general it is a painful process, and it should be painless!
9. The end result is degraded performance.

Here is the output from the sample:
....now both channels have had their first serializations and everything is completely initialized....

Time used during serialization: 2.0002
Time used during wcf serialization: 7.0007
Time used during serialization: 1.0001
Time used during wcf serialization: 68.0068
Time used during serialization: 1.0001
Time used during wcf serialization: 3.0003
Time used during serialization: 1.0001
Time used during wcf serialization: 4.0004
Time used during serialization: 1.0001
Time used during wcf serialization: 27.0027
Time used during serialization: 1.0001
Time used during wcf serialization: 4.0004
Time used during serialization: 2.0002
Time used during wcf serialization: 5.0005
Time used during serialization: 1.0001
Time used during wcf serialization: 5.0005
Time used during serialization: 1.0001
Time used during wcf serialization: 5.0005
Time used during serialization: 1.0001
Time used during wcf serialization: 4.0004
Time used during serialization: 1.0001
Time used during wcf serialization: 5.0005
Time used during serialization: 1.0001
Time used during wcf serialization: 6.0006
Client Remoting and WCF tests is complete...
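Single measurements like the ones above are noisy (note the outliers at 68 and 27), so comparisons become easier to read if each call is warmed up and timed several times. The helper below is a minimal sketch of that idea using Stopwatch; it is not how the sample measures time, and the name Timing.MedianMilliseconds is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class Timing
{
    // Runs the action 'warmup' times without timing it (JIT, channel setup),
    // then 'runs' times with timing, and returns the median in milliseconds.
    public static double MedianMilliseconds(Action action, int warmup, int runs)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");

        for (int i = 0; i < warmup; i++) action();

        var samples = new List<double>(runs);
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            samples.Add(watch.Elapsed.TotalMilliseconds);
        }

        // The median is less sensitive to a single slow run than the mean.
        samples.Sort();
        int mid = samples.Count / 2;
        return samples.Count % 2 == 1
            ? samples[mid]
            : (samples[mid - 1] + samples[mid]) / 2.0;
    }
}
```

A remoting call and a WCF call could then each be wrapped in a lambda and passed to Timing.MedianMilliseconds with, say, 2 warm-up runs and 11 timed runs.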

While reading more about the NetDataContractSerializer I realized that there was a bug in the configuration of the WCF infrastructure, and that additional custom code must be written to remove all references to the DataContractSerializer as the strategy chosen for building the XML to be binary serialized.

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.Xml;

/// <summary>
/// TODO: Document me!
/// </summary>
/// <author>Steinar Dragsnes</author>
public class WcfServiceEndpointConfigurer
{
    private WcfSerializationMode mode;
    private WcfDataContractType contractType;

    /// <summary>
    /// Default ctor setting up the instance as default to produce WCF services as .Net Remoting services.
    /// </summary>
    public WcfServiceEndpointConfigurer() : this(WcfSerializationMode.AsNetRemoting, WcfDataContractType.NetDataContractSerializer) { }

    /// <summary>
    /// General ctor where the invoker may configure how the services should be set up.
    /// </summary>
    /// <param name="mode">Serialization mode or protocol; net.tcp, http, named pipes etc</param>
    /// <param name="contractType">Data contract type used to build the xml to be serialized.</param>
    public WcfServiceEndpointConfigurer(WcfSerializationMode mode, WcfDataContractType contractType)
    {
        this.mode = mode;
        this.contractType = contractType;
    }

    /// <summary>
    /// Configures the service end point so that it can receive requests from clients.
    /// </summary>
    /// <typeparam name="T">The service type to host.</typeparam>
    public void ConfigureServerEndPoint<T>()
    {
        ServiceHost host = new ServiceHost(typeof(T));
        if (mode == WcfSerializationMode.AsNetRemoting) WcfAsNetRemoting(host);

        host.Open();
    }

    /// <summary>
    /// Based on the uri, create a transparent proxy casted to type T and create communication channel to establish connection.
    /// </summary>
    /// <typeparam name="T">The type of the instance to cast the transparent proxy to.</typeparam>
    /// <param name="uri">The uri used to create a channel for connecting to the server endpoint.</param>
    /// <returns>An instance of type T which can be interacted with in a request-response fashion.</returns>
    public T ConfigureClientEndPoint<T>(string uri)
    {
        // Note: the single-string ChannelFactory<T> constructor treats its argument as the
        // name of a client endpoint in the configuration file, which is where the binding comes from.
        ChannelFactory<T> factory = new ChannelFactory<T>(uri);
        Binding unknownBinding = factory.Endpoint.Binding;
        if (unknownBinding is NetTcpBinding) factory.Endpoint.Binding = FixNetTcpBinding((NetTcpBinding)unknownBinding);

        // This is the only way to globally switch out DataContractSerializer.
        if (contractType == WcfDataContractType.NetDataContractSerializer) SwitchToNetDataContractSerializer(factory.Endpoint);

        return factory.CreateChannel();
    }

    // If the configuration file is completely misconfigured (or not configured for WCF running as .Net Remoting :), then this method should not do anything.
    private void WcfAsNetRemoting(ServiceHost host)
    {
        foreach (ServiceEndpoint endPoint in host.Description.Endpoints)
        {
            Binding unknownBinding = endPoint.Binding;
            // If binding should be NetTcp to get as close as .Net Remoting as possible, then we need to fix a lot of constraints on the binding
            // so that we can serialize object graphs of some decent size.
            if (unknownBinding is NetTcpBinding) endPoint.Binding = FixNetTcpBinding((NetTcpBinding)unknownBinding);

            // This is the only way to globally switch out DataContractSerializer.
            if (contractType == WcfDataContractType.NetDataContractSerializer)
            {
                SwitchToNetDataContractSerializer(endPoint);
            }
        }
    }

    // Replaces the default serializer behavior on every operation of the endpoint's contract.
    private void SwitchToNetDataContractSerializer(ServiceEndpoint endPoint)
    {
        foreach (OperationDescription desc in endPoint.Contract.Operations)
        {
            DataContractSerializerOperationBehavior dcsOperationBehavior = desc.Behaviors.Find<DataContractSerializerOperationBehavior>();
            if (dcsOperationBehavior != null)
            {
                // NetDataContractSerializerOperationBehavior is a custom behavior class (see the forum thread above).
                NetDataContractSerializerOperationBehavior serializer = new NetDataContractSerializerOperationBehavior(desc);
                serializer.MaxItemsInObjectGraph = Int32.MaxValue;
                int idx = desc.Behaviors.IndexOf(dcsOperationBehavior);
                desc.Behaviors.Remove(dcsOperationBehavior);
                desc.Behaviors.Insert(idx, serializer);
            }
        }
    }

    // Setting most data-transport constraints on the tcpBinding to their respective maximum limits.
    private NetTcpBinding FixNetTcpBinding(NetTcpBinding binding)
    {
        binding.ReaderQuotas = new XmlDictionaryReaderQuotas
        {
            MaxArrayLength = Int32.MaxValue,
            MaxBytesPerRead = Int32.MaxValue,
            MaxNameTableCharCount = Int32.MaxValue,
            MaxDepth = Int32.MaxValue,
            MaxStringContentLength = Int32.MaxValue,
        };

        binding.MaxReceivedMessageSize = Int32.MaxValue;
        binding.MaxBufferPoolSize = long.MaxValue;
        binding.MaxBufferSize = Int32.MaxValue;
        return binding;
    }
}

With this in place, I saw that I could serialize the same object graphs much faster in WCF than in .Net Remoting! How is this possible? Doesn't the NetDataContractSerializer even build up an XML document before serializing the stream of data? To turn this question around, maybe the old BinaryFormatter has a sloppy implementation, with a lot of possible improvements and easy wins which would have made it much faster? Shouldn't it be more difficult to create an XML representation of an object graph than a binary one?
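One possible piece of the answer, offered as a hypothesis rather than a measured result: the WCF serializers write to an XmlDictionaryWriter, and NetTcpBinding uses a binary message encoding by default, so the "XML" may never exist as text at all. The sketch below shows the same DataContractSerializer writing through a text writer and through a binary writer. SamplePermission and EncodingComparison are hypothetical names and the code is not part of the sample.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical stand-in for the Permission type of the sample.
[DataContract]
public class SamplePermission
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

public static class EncodingComparison
{
    // Serializes the value either as textual XML or as binary-encoded XML.
    public static byte[] Serialize(SamplePermission value, bool binary)
    {
        var serializer = new DataContractSerializer(typeof(SamplePermission));
        using (var stream = new MemoryStream())
        {
            // Both writers receive the same XML infoset; only the byte encoding differs.
            using (XmlDictionaryWriter writer = binary
                ? XmlDictionaryWriter.CreateBinaryWriter(stream)
                : XmlDictionaryWriter.CreateTextWriter(stream))
            {
                serializer.WriteObject(writer, value);
                writer.Flush();
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    // Reads a binary-encoded payload produced by Serialize(value, true).
    public static SamplePermission DeserializeBinary(byte[] payload)
    {
        var serializer = new DataContractSerializer(typeof(SamplePermission));
        using (XmlDictionaryReader reader =
            XmlDictionaryReader.CreateBinaryReader(payload, XmlDictionaryReaderQuotas.Max))
        {
            return (SamplePermission)serializer.ReadObject(reader);
        }
    }
}
```

Comparing the lengths of the two byte arrays for the same object gives a feel for how much of the textual overhead disappears with the binary encoding. Whether this explains the timings above would have to be confirmed by profiling the sample itself.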

Anyhow, I wondered if this speed difference would change if I created a really, really deep graph, hundreds of objects deep. When the test was in place I started the sample application, but the WCF service crashed. The error message said something about message size or object depth exceeding the configured limit. Some additional googling on the error message pointed me to a solution where one increases the various settings on the NetTcpBinding, as seen in the code excerpt above. With this small modification I am able to serialize really deep object graphs. This is perfect for old .Net Remoting applications which move big object graphs. Even though you won't have interoperability, you may at least move onto a supported framework and maybe enjoy some of the tools for monitoring WCF services.

Here is a printout of the speed differences:

Remoting serialization cost in ms: 147.0147
-===-WCF serialization cost in ms: 200.02 -===-
Remoting serialization cost in ms: 291.0291
-===-WCF serialization cost in ms: 245.0245 -===-
Remoting serialization cost in ms: 420.042
-===-WCF serialization cost in ms: 349.0349 -===-
Remoting serialization cost in ms: 658.0658
-===-WCF serialization cost in ms: 466.0466 -===-
Remoting serialization cost in ms: 800.08
-===-WCF serialization cost in ms: 574.0574 -===-
Remoting serialization cost in ms: 1074.1074
-===-WCF serialization cost in ms: 643.0643 -===-
Remoting serialization cost in ms: 1332.1332
-===-WCF serialization cost in ms: 700.07 -===-
Remoting serialization cost in ms: 1500.15
-===-WCF serialization cost in ms: 789.0789 -===-
Remoting serialization cost in ms: 1850.185
-===-WCF serialization cost in ms: 981.0981 -===-
Remoting serialization cost in ms: 2024.2024
-===-WCF serialization cost in ms: 1038.1038 -===-
Remoting serialization cost in ms: 2299.2299
-===-WCF serialization cost in ms: 1284.1284 -===-
Remoting serialization cost in ms: 2520.252
-===-WCF serialization cost in ms: 1342.1342 -===-
Client Remoting and WCF tests is complete...
Press Space to end process...

The interesting thing about this printout is that the object graph gets deeper and deeper: each time the root object (an instance of a User) is requested, a number of objects are added to the user before the request returns. The user instance on the server side thus grows larger and deeper with every call, and becomes more expensive to move. Remoting and WCF each have their own user, which they call repeatedly. The impressive thing when reading this output is that WCF shows only a slight time increase as the root object's footprint grows, while .Net Remoting's time increase is two to three times that of WCF. In the end, WCF is about twice as fast as .Net Remoting. Don't you believe me? Please download the sample and see for yourself!

For additional tips or comments about how to make this blog post better, please contact me.

Wednesday, 17 February 2010

TimeZone aware DateTime implementation in C#

I thought I would start off by writing a bit about my experiences over the last 2.5 years working with date-time instances rooted in different time zones, and how this experience ended up in a library that wraps the .Net 2.0 implementation of DateTime in a new type called LocalDateTime.

To be continued...
