I’m back after some time spent on hard work, overtime, double shifts and all that, since my daytime project was heading for a rough landing. On the upside, it gave me a lot to think and write about. Here’s the first topic – queue implementations and thread sleeps.

First of all, what motivated this – quite often (too often) I see a piece of code that looks like this:
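Something along these lines – a simplified sketch of the pattern rather than code from any particular codebase, assuming a Queue<int> shared between producer and consumer threads, with Process standing in for the real work:

var queue = new Queue<int>();            // shared between producers and consumers

// consumer thread
while (true)
{
    while (queue.Count == 0)
    {
        // busy-wait for data – this is where all the CPU goes
        // (the common 'improvement' puts Thread.Yield() or Thread.Sleep(0) here or right after this loop)
    }

    int item;
    lock (queue)
    {
        item = queue.Dequeue();
    }
    Process(item);                       // placeholder for the actual per-item work
}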

Here the queue is shared among producers and consumers. The problem with this code is that if you run it, it’s going to spin whenever the queue is more often empty than not (and it most likely is, unless you have enormous amounts of data flowing through it). A very frequent modification of this ‘pattern’ is one with Thread.Yield() or Thread.Sleep(0) (they’re roughly equivalent) after the while loop.

I used to think that it’s not all that bad: the Yield would give back the rest of the time slice without spinning much, the waste would only be in context switching, and it wouldn’t be that CPU-consuming. Wrong! It does perform slightly better under overall CPU-heavy load, but in ‘idle’ mode it still consumes a whole CPU core (if there is one consumer).

Another variation of this is something like Thread.Sleep(1) or Thread.Sleep(5). While it helps in idle mode, it has a couple of problems: it still does a lot of completely unnecessary context switching, and it also delays your processing (the bigger the sleep value, the larger the delay and the less context switching).

So how can we do better? I used to think some form of Monitor.Wait and Monitor.Pulse/PulseAll was a good solution, and while it works fine, it’s usually a bit difficult to get completely right and the consequences of not getting it right are dire. If you miss one Pulse or mishandle some exception, you can stop processing, or keep waiting while there are messages in the queue, or a whole bunch of other unwanted stuff can happen.
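For reference, the Wait/Pulse approach looks roughly like this (a simplified sketch of a class fragment, with a single lock object guarding a shared queue):

private readonly object sync = new object();
private readonly Queue<int> queue = new Queue<int>();

public void Produce(int item)
{
    lock (sync)
    {
        queue.Enqueue(item);
        Monitor.Pulse(sync);        // miss this on some code path and a consumer can sleep forever
    }
}

public int Consume()
{
    lock (sync)
    {
        while (queue.Count == 0)    // re-check in a loop; Wait can return without data being available
            Monitor.Wait(sync);
        return queue.Dequeue();
    }
}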

Happily, .NET 4.0 introduced BlockingCollection<T>. This code:
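Roughly (a sketch of the consumer side – Process again stands in for the real per-item work):

var collection = new BlockingCollection<int>();

// consumer
foreach (var item in collection.GetConsumingEnumerable())
{
    Process(item);              // blocks waiting for items instead of spinning
}
// the loop ends once CompleteAdding() has been called and the collection is drained

// producer
collection.Add(42);
// ... and when done:
collection.CompleteAdding();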

can completely replace what we had above (minus the cancellation – we’ll get there in a moment).

The foreach loop will block until there are elements in the collection and will consume them in a nice, thread-safe manner. When a producer is done adding, it can call collection.CompleteAdding() and the loop will end for all consumers.

BlockingCollection also contains a bunch of static methods for dealing with multiple collections at once (AddToAny/TakeFromAny) that can also take cancellation tokens and bail out nicely if you decide to quit processing. One word of caution – there is no guarantee of any priority between the collections. If you have two blocking collections and you call TakeFromAny, you’ll most likely receive elements from the first one (if there are any) and only then from the second one, but that behavior is not guaranteed (at least I have not found it documented anywhere). In .NET 4 and 4.5 ILSpy can tell us that it always behaves this way, but in future versions – who knows.
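A sketch of what that can look like with two collections and a cancellation token (highPriority, lowPriority and Process are just illustrative names):

var queues = new[] { highPriority, lowPriority };   // both are BlockingCollection<int>
var cts = new CancellationTokenSource();

try
{
    while (true)
    {
        int item;
        // blocks until any of the collections has an item, or the token is cancelled
        int index = BlockingCollection<int>.TakeFromAny(queues, out item, cts.Token);
        Process(index, item);
    }
}
catch (OperationCanceledException)
{
    // cts.Cancel() was called somewhere else – bail out nicely
}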

One last thing. This behavior brings to mind another cool library for .NET – Reactive Extensions (parts of it – some interfaces – are actually part of the framework). In short, it’s a way of treating your collections (or any data sources) as a sort of ‘data available event producer’ (and more). It’s pretty robust, and it plays nicely with LINQ. Check it out here: http://msdn.microsoft.com/en-us/data/gg577609.aspx.
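A tiny taste of it (assuming the Rx NuGet package is referenced; the query itself is just an illustration):

using System;
using System.Reactive.Linq;

class RxTaste
{
    static void Main()
    {
        // a sequence that produces 0, 1, 2, ... once per second, filtered with LINQ operators
        IDisposable subscription = Observable
            .Interval(TimeSpan.FromSeconds(1))
            .Where(x => x % 2 == 0)
            .Subscribe(x => Console.WriteLine("got {0}", x));

        Console.ReadLine();      // let it run until Enter is pressed
        subscription.Dispose();
    }
}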

 

Everyone (ok, nearly everyone or this wouldn’t happen) knows that if you have a multithreaded application and different threads are accessing shared resources (and some of them do writes), you have to synchronize the access somehow.

Microsoft introduced a neat set of concurrent collections in the framework to help with that, but before ConcurrentDictionary there was the plain old Dictionary. Some time ago I came across a piece of software that was crashing due to lack of proper synchronization (unfortunately this still happens a lot), and it got me thinking: what is the worst that could happen? So I started experimenting with multiple threads and a Dictionary to see in how many ways it can break. What I found surprised me. Apart from the duplicate-key and key-not-found exceptions, which one may argue are not entirely tied to the lack of synchronization, two behaviors stood out.
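The testing program was nothing sophisticated – conceptually something like this (a sketch; thread counts and key ranges are arbitrary):

// using System; using System.Collections.Generic; using System.Threading;
var dictionary = new Dictionary<long, int>();   // deliberately *not* synchronized
var threads = new List<Thread>();

for (int t = 0; t < 4; t++)
{
    int seed = t;
    threads.Add(new Thread(() =>
    {
        var random = new Random(seed);
        for (int i = 0; i < 100000; i++)
        {
            dictionary[random.Next(0, 1000)] = i;      // concurrent adds/updates
            dictionary.Remove(random.Next(0, 1000));   // and concurrent removes
        }
    }));
}

threads.ForEach(thread => thread.Start());
threads.ForEach(thread => thread.Join());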

The first one is rather trivial (it’s actually the exception I observed in the aforementioned piece of software, so no surprise there):

The collection is being resized while we’re still trying to perform operations on it.

The second one is more interesting. At some point my small testing program just didn’t end. I looked at the Task Manager and this is what I saw:

[Task Manager screenshot: one core’s worth of CPU pegged at 25%]

25% CPU utilization – most likely one thread going crazy. I connected with WinDbg and here are some of the things I saw:

So there is one offending thread – let’s see what it’s doing:

So it seems it’s stuck on inserting – actually inside the framework’s Dictionary class – spinning!

Let’s find this dictionary and see if it’s corrupted:

This is my dictionary – it was declared as Dictionary<long,int>. There are four of them, but that’s because the program was running in a loop, creating dictionaries and trying to break them. The remaining three just haven’t been garbage collected yet, as can be observed below.

Let’s see if that indeed is the case:

And sure enough, that’s the only one with any GC root at all.

Before we look inside, a word about the structure of Dictionary – there are two important arrays inside:

  • int[] buckets – for each bucket it holds the index into the entries table of the first Entry whose hashCode maps to that bucket
  • Entry[] entries – it contains all the entries, chained in a linked-list fashion (the Entry struct has a next field holding the index of the next Entry)

So the way this works is: when you add an object, a hashCode is calculated for it, a bucket index is selected by calculating hashCode % buckets.Length, and the int in the buckets table at that index points at the first entry of that ‘bucket’ in the entries table. If there is none, yours becomes the first. If there are some, the whole chain is walked (to check whether the key is already present) and the new entry is linked into that bucket’s chain.

Knowing that, we can look at the structure of my Dictionary:

Ok – just 10 elements inside the Dictionary, let’s see what the Entries look like:

 

For brevity I include only two relevant entries (actually only one is relevant). Entry number six – the next entry is number seven (so far so good). Entry number seven – the next is… number seven! Aha! That’s really bad. Concurrent inserting (and deleting) without any synchronization has corrupted the Dictionary so that in the entries linked list one of the entries is pointing at itself!

I would imagine the Dictionary insert code looks something like this (a simplified sketch, not the actual framework source):
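// simplified, imagined version of the insert path – not the real framework code
private int[] buckets;    // buckets[hash % buckets.Length] = index of the first entry in that bucket's chain
private Entry[] entries;  // entries chained by index via 'next' (-1 = end of chain)

private struct Entry
{
    public int hashCode;
    public int next;
    public long key;
    public int value;
}

public void Insert(long key, int value)
{
    int hashCode = key.GetHashCode() & 0x7FFFFFFF;
    int targetBucket = hashCode % buckets.Length;

    // walk the bucket's chain to see whether the key is already there;
    // if an entry's 'next' points back at itself, this loop never terminates
    for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next)
    {
        if (entries[i].hashCode == hashCode && entries[i].key == key)
        {
            entries[i].value = value;
            return;
        }
    }

    // ... otherwise allocate a new entry, link it into the chain and update buckets[targetBucket]
}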

With such a corrupted structure, the insertion falls into an infinite loop, spinning on next.next.next… (which is exactly what I observed).

If you know of any other way this could go wrong, send me an email or leave a comment. I’m sure there has to be some other ill behavior resulting from the lack of proper synchronization.

As mentioned last time, there is another behavior that may be a surprise (it certainly was to me when I discovered it). Consider the following minimal TCP server code:
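The listing below is a reconstruction of the idea rather than the exact original: a socket server whose receive callback (called Rcv here, matching the name referenced later) copies the bytes out of the shared buffer, re-issues the receive, and only then prints, all ‘protected’ by a lock on the type. The port number is arbitrary.

using System;
using System.Net;
using System.Net.Sockets;

class Server
{
    private static Socket client;
    private static readonly byte[] buffer = new byte[1024];

    static void Main()
    {
        var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 12345));   // port chosen arbitrarily for the sketch
        listener.Listen(1);
        client = listener.Accept();

        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, Rcv, null);
        Console.ReadLine();
    }

    private static void Rcv(IAsyncResult ar)
    {
        lock (typeof(Server))                                       // overkill, but should serialize access
        {
            int read = client.EndReceive(ar);
            if (read == 0) return;                                  // client closed the connection

            var data = new byte[read];
            Array.Copy(buffer, data, read);

            // start the next read before processing the data we just copied
            client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, Rcv, null);

            Console.Write(BitConverter.ToString(data) + " ");
        }
    }
}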

It’s as simple as it gets and seems to be quite correct. The lock on the type is overkill, but it should do the trick of making sure our buffer handling is consistent.

The client code just sends single bytes and flushes the stream every time (200 single-byte messages).
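Something along these lines (again a sketch, using the same arbitrary port as the server above; requires System.Net.Sockets):

using (var tcp = new TcpClient("localhost", 12345))
using (var stream = tcp.GetStream())
{
    for (int i = 0; i < 200; i++)
    {
        stream.WriteByte((byte)i);   // one byte per message
        stream.Flush();              // flushed every time
    }
}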
What is the output? Here it is (a relevant part of it at least):

Notice lines 3 and 4 – the data is printed in reverse order. How come?!

Let’s take another look at the server code above.

Console.Write is protected by the lock, it is guaranteed that only one thread will call it at a time, and the buffer is already copied – so what can be wrong?

The only thing that might raise an eyebrow is the call to BeginReceive happening before the processing of the data read from the buffer. It cannot interfere with the data, though, as there is a lock and our data is already in a copied byte array. “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.“ The only remaining option is that the thread doing the out-of-order write is the same one that was asynchronously called to process the incoming data – and that instead of starting the read and returning, BeginReceive sometimes synchronously invokes the async callback. The MSDN documentation fails to mention this, but that’s exactly what happens. We can verify it easily by adding the following piece of code:
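(The exact snippet isn’t shown here; the idea is simply to log the current managed thread id at the top of the receive callback, for example:)

// at the beginning of the Rcv callback (requires System.Threading)
Console.WriteLine("rcv on thread {0}", Thread.CurrentThread.ManagedThreadId);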

and here is the relevant output:

Notice lines 1 and 8 – it’s the same thread calling the Rcv method twice. Since locks in C# are reentrant, the lock doesn’t stop the nested EndReceive from executing (if locks weren’t reentrant, this would be a self-deadlock).

In the software where I observed this it didn’t happen a lot – the bigger the load, the more likely it was to surface. When it did happen, the recursion depth was around 5-6. The reason is that under the hood an OverlappedAsyncResult is used, which can detect that another async IO operation is in progress and piggyback on it.

Hope you’ll find this useful – I learned it the hard way.