I've been scratching my head on this one for several days.
After spending a number of posts looking at System.Diagnostics for tracing, and especially at the concept of activity IDs, I decided to put into a project I'm working on. And it all went wrong.
Here's the scenario. I have a WPF application. At start up, it goes straight into a business process that is neatly encapsulated in a model-view-presenter style, written using the Supervising Controller pattern (well, WPF's databinding is so powerful and so pervasive, it'd be rude not to). When I spin up my controller, I create and transfer to a new activity. Events logged at this point correctly live in the new activity.
When I click a button to make something happen, a (custom) event is raised from the view, and handled by the controller. Events logged here are suddenly not part of the activity. The activity Id has been reset to a zero guid.
From the msdn docs on CorrelationManager:
The CorrelationManager class provides methods used to store a logical operation identity in a thread-bound context and automatically tag each trace event generated by the thread with the stored identity.
So obviously my first thought was that this is a threading issue. Turns out it is, but not how you'd expect.
I verified that the thread I started the activity on and the thread used to execute the event handlers were the same, so they should really have the same activity Id.
A serious session of spelunking with Reflector later (and I do mean serious), I had my answer. The CorrelationManager uses Remoting's CallContext to store the activity Id (which has the nice side effect of being transferred across AppDomain boundaries, propagating the activity Id). Cutting a long story short, the data set via the CallContext ends up (indirectly) in an instance of a class called ExecutionContext.
And it's ExecutionContext that causes all the trouble. Again, from the msdn docs:
The ExecutionContext class provides a single container for all information relevant to a logical thread of execution.
The key word there is "logical".
Ever wondered how your security impersonation data is transferred between threads? What about your call stack for Code Access Security? Yep. ExecutionContext.
Every time you start a thread or queue an operation on the thread pool, the current ExecutionContext is carried with it. This is really rather cool when you think about it.
How does that cause problems for me? Well, WPF, like Windows Forms, needs to marshal calls onto the correct thread to prevent deadlocks. That needs an ExecutionContext. If the ExecutionContext is grabbed before I make my change to the activity Id, it won't get propagated. The existing ExecutionContext, which contains the previous, empty activity Id will be used. Practically, this means that all event handlers run under an ExecutionContext created by WPF, and it's this ExecutionContext I am seeing as having no activity Id. And the ExecutionContext is used even if you're already on the correct thread - presumably for CAS and impersonation reasons.
I can see two ways to fix this problem. If you noticed in the description of my problem - I was starting the activity in code that got evaluated from Main, not from the message pump. If I set it from the message pump, I think I'd be using the same ExecutionContext that WPF uses for the event handlers. I suspect that would look something like:
Dispatcher.Invoke(DispatcherPriority.Normal, delegate {
_previousActivityId = Trace.CorrelationManager.ActivityId;
Guid newActivityId = Guid.NewGuid();
_traceSource.TraceTransfer(0, null, newActivityId);
Trace.CorrelationManager.ActivityId = newActivityId;
_traceSource.TraceEvent(TraceEventType.Start, 0, "New activity");
});
But I haven't tried it, so can't be sure - I don't actually know where the WPF ExecutionContext gets created, or even how often it gets created. Relying on this is too flaky for my liking. (But I do like how elegant it is to be able to execute an arbitrary anonymous delegate on the UI thread. Just try doing that in Win32.)
As usual, we can solve this by adding an extra level of abstraction. I think I'm going to add a new class that will encapsulate starting and transferring an activity. All logging will go through this, too, and before each log it will check that the current activity is the correct one. If not, it will set it, log the event and reset it back to what it was. Simple, and effective.
Just to be clear, this won't be a problem for most uses of activities. If you have a bounded activity - i.e. if you both start and stop the activity in a single method, then you'll always be on a single thread of execution, always using the same ExecutionContext and all will be good. If you start an activity and then queue up an operation on the thread pool, the current ExecutionContext (with the current activity Id) will go with it, and again, all will be good. You're only going to have a problem when you have pieces of your activity executing in different contexts, such as with Windows Forms or WPF event handlers.