Some time ago, while working on performance-critical serialization code for a C# project, I was looking deeply into all flavors of available serialization benchmarks. One thing that struck me at the time was that there weren’t that many of them. The most comprehensive one I found was this: http://www.servicestack.net/benchmarks/NorthwindDatabaseRowsSerialization.100000-times.2010-08-17.html Published by ServiceStack, so maybe a little biased, but pretty neat nonetheless.

The published results were pretty much in line with what I had observed – protobuf was a clear winner, but if you didn’t want to (or couldn’t) use open source, DataContractSerializer was the best option available to you.

After some more testing, I started finding discrepancies between this report and my own tests, and they were pretty big – it seemed that DataContractSerializer was better than reported, and the whole order was sometimes shuffled. It all boiled down to two major differences that I’ll explain here.


Data Contract Serializer performance

In the quoted tests, DataContractSerializer was almost 7 times slower than protobuf. That’s a lot. In my tests it was 3 times slower. Big difference, huh? When I ran the tests on my own machine (the thing I love about the people at servicestack.net is that they published the source of their tests on Git, so you can reproduce them at home), I found that DataContractSerializer was indeed 7 times slower. So obviously, there had to be a difference in the way we used it! And indeed there was: when I added my version, it was 2,8 times slower than protobuf on the same test set. Here’s the difference:

Original:
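(I’m paraphrasing from memory here – the exact code is in the ServiceStack benchmark sources and the helper names below are mine – but the essence is the standard pattern: the default DataContractSerializer writing text XML straight into a MemoryStream.)

    using System.IO;
    using System.Runtime.Serialization;
    using System.Text;

    public static class TextDcs
    {
        // Default text XML output from DataContractSerializer.
        public static string Serialize<T>(T obj)
        {
            using (var ms = new MemoryStream())
            {
                new DataContractSerializer(typeof(T)).WriteObject(ms, obj);
                return Encoding.UTF8.GetString(ms.ToArray());
            }
        }

        public static T Deserialize<T>(string xml)
        {
            using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
            {
                return (T)new DataContractSerializer(typeof(T)).ReadObject(ms);
            }
        }
    }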

My version:
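(Again a sketch rather than the literal code, with my own helper names – the whole trick is to push the same DataContractSerializer through XmlDictionaryWriter.CreateBinaryWriter / XmlDictionaryReader.CreateBinaryReader instead of the default text XML writer.)

    using System.IO;
    using System.Runtime.Serialization;
    using System.Xml;

    public static class BinaryDcs
    {
        // Same DataContractSerializer, but producing the compact binary XML format.
        public static byte[] Serialize<T>(T obj)
        {
            using (var ms = new MemoryStream())
            using (var writer = XmlDictionaryWriter.CreateBinaryWriter(ms))
            {
                new DataContractSerializer(typeof(T)).WriteObject(writer, obj);
                writer.Flush();
                return ms.ToArray();
            }
        }

        public static T Deserialize<T>(byte[] data)
        {
            using (var ms = new MemoryStream(data))
            using (var reader = XmlDictionaryReader.CreateBinaryReader(ms, XmlDictionaryReaderQuotas.Max))
            {
                return (T)new DataContractSerializer(typeof(T)).ReadObject(reader);
            }
        }
    }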

See the difference?

Sure, it’s binary, so you can’t always use it, but in most cases it will do, and it’s way faster. I also noticed that the payload size is only about 1/3 larger than JsonDataContractSerializer’s, which is actually pretty good, so if you’re worried about bandwidth usage, that’s an additional plus. Here are the results of running the Northwind test data on my machine with just one additional serialization option (the binary one):

Serializer                      | Payload larger than best | Serialization | Deserialization | Slower than best
MS DataContractSerializer       | 5,89x                    | 10,24x        | 5,38x           | 7,24x
MS DataContractSerializerBinary | 4x                       | 3,01x         | 2,75x           | 2,85x
MS JsonDataContractSerializer   | 2,22x                    | 8,07x         | 9,59x           | 9,01x
MS BinaryFormatter              | 6,44x                    | 10,35x        | 6,23x           | 7,81x
ProtoBuf.net                    | 1x                       | 1x            | 1x              | 1x
NewtonSoft.Json                 | 2,22x                    | 2,83x         | 3,00x           | 2,94x
ServiceStack Json               | 2,11x                    | 3,43x         | 2,48x           | 2,84x
ServiceStack TypeSerializer     | 1,67x                    | 2,44x         | 2,05x           | 2,19x


With this difference out of the way, the results were more consistent with my tests, but there were still big differences in some cases, which leads us to the second important point.


It’s all about data

Nothing will be best for all cases (not even protobuf), so you should carefully consider your use case. In my tests the data was larger, and usually more complex, than in the Northwind data set used for benchmarking by servicestack.net. Even in their results, the smaller the data set, the bigger the difference – for RegionDto, which has only one int and one string, DataContractSerializer is 14 times slower (10 times slower on my machine; binary XML for DataContractSerializer is 3,71 times slower).
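(For a sense of scale, RegionDto is about as small as a DTO gets – something along these lines; I’m quoting the shape from memory, not the exact benchmark source, so the property names may differ.)

    using System.Runtime.Serialization;

    [DataContract]
    public class RegionDto
    {
        [DataMember] public int Id { get; set; }
        [DataMember] public string RegionDescription { get; set; }
    }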

So, does DataContractSerializer perform better (especially in the binary version) when larger objects are involved? It does indeed.

I created an EmployeeComplexDto class that inherits from EmployeeDto and adds some data, as follows:
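(The exact member names below are from memory, so treat this as a sketch of the shape rather than the literal class – the point is a handful of collections of other Northwind DTOs on top of the inherited EmployeeDto fields.)

    using System.Collections.Generic;
    using System.Runtime.Serialization;

    [DataContract]
    public class EmployeeComplexDto : EmployeeDto
    {
        [DataMember] public List<OrderDto> Orders { get; set; }       // up to 10 per test object
        [DataMember] public List<CustomerDto> Customers { get; set; } // up to 10 per test object
        [DataMember] public List<EmployeeDto> Friends { get; set; }   // up to 10 per test object
    }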

The object being serialized consisted of no more than 10 Orders, 10 Customers and 10 Friends (the same amount and the same data for each test), and here’s what I got serializing and deserializing it 100 000 times:

Serializer                      | Payload size (bytes) | Larger than best | Serialization | Deserialization | Total    | Avg per iteration | Slower than best
MS DataContractSerializer       | 1155                 | 1,88x            | 6407245       | 6123807         | 12531052 | 125,3105          | 4,11x
MS DataContractSerializerBinary | 920                  | 1,50x            | 2380570       | 3452640         | 5833210  | 58,3321           | 1,91x
MS JsonDataContractSerializer   | 865                  | 1,41x            | 7386162       | 14391807        | 21777969 | 217,7797          | 7,14x
MS BinaryFormatter              | 1449                 | 2,36x            | 9734509       | 7949369         | 17683878 | 176,8388          | 5,80x
ProtoBuf.net                    | 613                  | 1x               | 1099788       | 1948251         | 3048039  | 30,4804           | 1x
NewtonSoft.Json                 | 844                  | 1,38x            | 2844681       | 4272574         | 7117255  | 71,1726           | 2,34x
ServiceStack Json               | 844                  | 1,38x            | 4904168       | 5747964         | 10652132 | 106,5213          | 3,49x
ServiceStack TypeSerializer     | 753                  | 1,23x            | 3055495       | 4606597         | 7662092  | 76,6209           | 2,51x

So – are you really ‘stuck’ with DataContractSerializer, or is it actually quite good? The answer, of course, is: it depends. Whenever you can, use the binary writer, as it is way faster; but even then, whether it’s the best choice can only be answered with a specific use case in mind – think about what data you will be serializing, how often, and so on.

As always when reasoning about performance, the best option is to test with your own data, in close-to-real scenarios, and (preferably) on hardware similar to what the application will run on in production. The closer you get to this, the more accurate your tests will be.