Get managed call-stacks in .NET for Registry access using ETW


I was recently debugging managed code which was accessing system registry implicitly because of an external dependent library.  So the first thing I asked was a Procmon log for of registry access. And then I also wanted look at the call-stacks for registry access, which Procmon does provide.

Here is a sample call-stack from Procmon for registry access. I am using linqpad as an example in this

ntoskrnl.exe    CmpCallCallBacks + 0x1c0    0xfffff80002c870d0
ntoskrnl.exe     ?? ::NNGAKEGL::`string’ + 0x4c81d    0xfffff80002c31116
ntoskrnl.exe    KiSystemServiceCopyEnd + 0x13    0xfffff800028d4853
ntdll.dll    ZwSetInformationKey + 0xa    0x779214aa
wow64.dll    wow64.dll + 0x2d252    0x7399d252
wow64.dll    wow64.dll + 0x1008f    0x7398008f
wow64.dll    wow64.dll + 0xcf87    0x7397cf87
wow64cpu.dll    TurboDispatchJumpAddressEnd + 0x24    0x7390276d
wow64.dll    wow64.dll + 0xd07e    0x7397d07e
wow64.dll    wow64.dll + 0xc549    0x7397c549
ntdll.dll    LdrpInitializeProcess + 0x17e2    0x779184c8
ntdll.dll     ?? ::FNODOBFM::`string’ + 0x2bea0    0x77917623
ntdll.dll    LdrInitializeThunk + 0xe    0x7790308e
ntdll.dll    NtOpenKey + 0x12    0x77acf9da
KernelBase.dll    OpenRegKey + 0x134    0x772f2ef3
KernelBase.dll    OpenAltSortsKey + 0x29    0x772de216
KernelBase.dll    IsValidLocale + 0x127    0x772e3271
clr.dll    clr.dll + 0xe4d5f    0x6be34d5f
clr.dll    clr.dll + 0xe4ba8    0x6be34ba8
mscorlib.ni.dll    mscorlib.ni.dll + 0x2ba2f1    0x697ba2f1
mscorlib.ni.dll    mscorlib.ni.dll + 0x2ba2c1    0x697ba2c1
mscorlib.ni.dll    mscorlib.ni.dll + 0x2b9faa    0x697b9faa
mscorlib.ni.dll    mscorlib.ni.dll + 0x2b9e0a    0x697b9e0a
mscorlib.ni.dll    mscorlib.ni.dll + 0x2ba15c    0x697ba15c
mscorlib.ni.dll    mscorlib.ni.dll + 0x2ba106    0x697ba106
mscorlib.ni.dll    mscorlib.ni.dll + 0x2b9db8    0x697b9db8
mscorlib.ni.dll    mscorlib.ni.dll + 0x2b9d34    0x697b9d34
clr.dll    clr.dll + 0x24a2a    0x6bd74a2a
clr.dll    clr.dll + 0x23153    0x6bd73153
clr.dll    clr.dll + 0x231cc    0x6bd731cc
clr.dll    clr.dll + 0x2323b    0x6bd7323b
clr.dll    clr.dll + 0x23415    0x6bd73415
clr.dll    clr.dll + 0x2355a    0x6bd7355a
clr.dll    clr.dll + 0x94844    0x6bde4844
<unknown>    0xb2420    0xb2420
mscorlib.ni.dll    mscorlib.ni.dll + 0x255870    0x69755870
mscorlib.ni.dll    mscorlib.ni.dll + 0x2557cd    0x697557cd
mscorlib.ni.dll    mscorlib.ni.dll + 0x255764    0x69755764
<unknown>    0x4e00d9    0x4e00d9
clr.dll    clr.dll + 0x21db    0x6bd521db
clr.dll    clr.dll + 0x24a2a    0x6bd74a2a
clr.dll    clr.dll + 0x24bcc    0x6bd74bcc
clr.dll    clr.dll + 0x24c01    0x6bd74c01
clr.dll    clr.dll + 0x24c21    0x6bd74c21
clr.dll    clr.dll + 0xece82    0x6be3ce82
clr.dll    clr.dll + 0xecf90    0x6be3cf90
clr.dll    clr.dll + 0xecda4    0x6be3cda4
clr.dll    clr.dll + 0xed199    0x6be3d199
clr.dll    clr.dll + 0xed09a    0x6be3d09a
clr.dll    clr.dll + 0x16af00    0x6bebaf00
mscoreei.dll    mscoreei.dll + 0x55ab    0x726355ab
mscoree.dll    mscoree.dll + 0x7f16    0x75077f16
mscoree.dll    mscoree.dll + 0x4de3    0x75074de3
ntdll.dll    __RtlUserThreadStart + 0x70    0x77ae9d72
ntdll.dll    _RtlUserThreadStart + 0x1b    0x77ae9d45

It is all unmanaged call-stacks , what I was looking for was managed call-stacks. The solution to this was ETW tracing. I use Perfmonitor from the BCL teams codeplex site for getting managed call-stacks.

Here are the steps to get managed call-stacks for registry access

PerfMonitor.exe /registry /stacks /lineNumbers start

Run the process and do the steps for registry access

PerfMonitor.exe /registry /stacks /lineNumbers stop
PerfMonitor.exe  /stacks /lineNumbers print

The above should produce a xml file which is by default named as PerfMonitorOutput.print.xml

And here is the managed call-stacks for registry access.

<Event MSec=  “3449.0880” PID=”3356″ PName= “LINQPad” TID=”4536″ EventName=”RegistryOpen” Status=”0x00000000″ KeyHandle=”0x00000000″ ElapsedTime=”0″ KeyName=”\Registry\Machine\System\CurrentControlSet\Control\Nls\Locale\Alternate Sorts” Index=”0″ InitialTime=”1/3/1601 1:11:43 PM”>
<StackTrace>
<CodeAddress Address=”0xfffff80002c22003″ FullMethodName=”_NULL_IMPORT_DESCRIPTOR” ModuleName=”ntoskrnl”/>
<CodeAddress Address=”0xfffff80002bb5d3b” FullMethodName=”$$VProc_ImageExportDirectory” ModuleName=”ntoskrnl”/>
<CodeAddress Address=”0xfffff800028d4853″ FullMethodName=”KiDeliverApc” ModuleName=”ntoskrnl”/>
<CodeAddress Address=”0x77920e0a” FullMethodName=”ZwOpenKeyEx” ModuleName=”ntdll”/>
<CodeAddress Address=”0x7399d1f1″ ModuleName=”wow64″/>
<CodeAddress Address=”0x7398008f” ModuleName=”wow64″/>
<CodeAddress Address=”0x7397cf87″ ModuleName=”wow64″/>
<CodeAddress Address=”0x7390276d” ModuleName=”wow64cpu”/>
<CodeAddress Address=”0x7397d07e” ModuleName=”wow64″/>
<CodeAddress Address=”0x7397c549″ ModuleName=”wow64″/>
<CodeAddress Address=”0x779184c8″ FullMethodName=”LdrpInitializeProcess” ModuleName=”ntdll”/>
<CodeAddress Address=”0x77917623″ FullMethodName=” ?? ::FNODOBFM::`string&apos;” ModuleName=”ntdll”/>
<CodeAddress Address=”0x7790308e” FullMethodName=”LdrInitializeThunk” ModuleName=”ntdll”/>
<CodeAddress Address=”0x77acf9da” FullMethodName=”NtOpenKey” ModuleName=”ntdll”/>
<CodeAddress Address=”0x772f2ef3″ ModuleName=”KernelBase”/>
<CodeAddress Address=”0x772de216″ ModuleName=”KernelBase”/>
<CodeAddress Address=”0x772e3264″ ModuleName=”KernelBase”/>
<CodeAddress Address=”0x6be34d5f” ModuleName=”clr”/>
<CodeAddress Address=”0x6be34ba8″ ModuleName=”clr”/>
<CodeAddress Address=”0x697ba2f1″ FullMethodName=”System.Globalization.CultureData.InitCultureData()” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697ba2c1″ FullMethodName=”System.Globalization.CultureData.CreateCultureData(class System.String,bool)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697b9faa” FullMethodName=”System.Globalization.CultureData.GetCultureData(class System.String,bool)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697b9e0a” FullMethodName=”System.Globalization.CultureInfo..ctor(class System.String,bool)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697ba15c” FullMethodName=”System.Globalization.CultureInfo.GetCultureByName(class System.String,bool)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697ba106″ FullMethodName=”System.Globalization.CultureInfo.InitUserDefaultCulture()” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697b9db8″ FullMethodName=”System.Globalization.CultureInfo.Init()” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697b9d34″ FullMethodName=”System.Globalization.CultureInfo..cctor()” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x6bd74a2a” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd73153″ ModuleName=”clr”/>
<CodeAddress Address=”0x6bd731cc” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd7323b” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd73415″ ModuleName=”clr”/>
<CodeAddress Address=”0x6bd7355a” ModuleName=”clr”/>
<CodeAddress Address=”0x6bde4844″ ModuleName=”clr”/>
<CodeAddress Address=”0x122420″/>
<CodeAddress Address=”0x69755870″ FullMethodName=”System.Version.TryParseVersion(class System.String,value class VersionResult&amp;)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x697557cd” FullMethodName=”System.Version.Parse(class System.String)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x69755764″ FullMethodName=”System.Version..ctor(class System.String)” ModuleName=”mscorlib”/>
<CodeAddress Address=”0x1e00d9″ FullMethodName=”LINQPad.Loader.Main(class System.String[])” ModuleName=”LINQPad”/>
<CodeAddress Address=”0x6bd521db” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd74a2a” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd74bcc” ModuleName=”clr”/>
<CodeAddress Address=”0x6bd74c01″ ModuleName=”clr”/>
<CodeAddress Address=”0x6bd74c21″ ModuleName=”clr”/>
<CodeAddress Address=”0x6be3ce82″ ModuleName=”clr”/>
<CodeAddress Address=”0x6be3cf90″ ModuleName=”clr”/>
<CodeAddress Address=”0x6be3cda4″ ModuleName=”clr”/>
<CodeAddress Address=”0x6be3d199″ ModuleName=”clr”/>
<CodeAddress Address=”0x6be3d09a” ModuleName=”clr”/>
<CodeAddress Address=”0x6bebaf00″ ModuleName=”clr”/>
<CodeAddress Address=”0x726355ab” ModuleName=”mscoreei”/>
<CodeAddress Address=”0x75077f16″ ModuleName=”mscoree”/>
<CodeAddress Address=”0x75074de3″ ModuleName=”mscoree”/>
<CodeAddress Address=”0x77ae9d72″ FullMethodName=”__RtlUserThreadStart” ModuleName=”ntdll”/>
<CodeAddress Address=”0x77ae9d45″ FullMethodName=”_RtlUserThreadStart” ModuleName=”ntdll”/>
</StackTrace>
</Event>

Here I was  able to get the managed as well as native call-stacks for registry access by keyname.  And here is my list of posts on ETW

Get Managed and Native Call Stacks from Event Tracing for Windows (ETW)


Here is the TOC of series of posts on ETW

With the latest release of PerfMonitor from the CLR team now we can have Managed call stacks along with native code. How cool is that?  To get managed / native call stacks we need minimum of Vista, this is not available in XP.

Here is the sample source code I am going to be using to demonstrate the managed and the native call-stacks

using System;
using System.Linq;
namespace ConsoleApplication9 {
class Program {
private static void Main() {
Console.Read();
AllocateLOH();
}
private static void AllocateLOH() {
for (int i = 0; i < 10; i++) {
var x = new object[85000];
x.Count();
}
}
}
}

This is a very simple app that allocates large chunks of memory, which will get allocated in the LOH and frequent allocations in LOH causes high CPU. To identify this we would have to attach it to windbg and put a break-point on “WKS::gc_heap::allocate_large_object” and here I have blogged about the same

But with ETW we will not need Windbg, the ETW can do stack walks for the function calls and this makes it really great. To enable this feature first thing is to download the source and compile the code from codeplex . FYI if you are using x64 th tool would not work. It would work on wow64 , so compile it for x86.

After which run the command


PerfMonitor crawlableCLRStacks

This will take a while to complete, it is essentially setting the ngened images to debug attributes so we can get better stack walks. The next command is to start etw trace


PerfMonitor.exe  /lineNumbers /stacks /keyAny:0x1 start

So we are essentially instructing the tool to  get GC Information (/keyAny:0x1) along with stackwalk . The tool provides help which really useful.

After which I start the application and let the complete and then I stop the trace.


Perfmonitor.exe stop

And then dump the trace to a xml file


perfmonitor.exe print

This will dump out a huge file but what we are interested in Large Object Allocations on GC call stacks


<Event MSec=  "6940.8972" PID="365044" PName="ConsoleApplication9" TID="364352" EventName="GCAllocationTick" AllocationAmount="0x00053040" AllocationKind="Large" ClrInstanceID="15">

<StackTrace>

<CodeAddress Address="0x66669909" FullMethodName="ETW::SamplingLog::SendStackTrace" ModuleName="clr"/>

<CodeAddress Address="0x66669a8d" FullMethodName="EtwCallout" ModuleName="clr"/>

<CodeAddress Address="0x6669d870" FullMethodName="CoMofTemplate_qqh" ModuleName="clr"/>

<CodeAddress Address="0x666d5d44" ModuleName="clr"/>

<CodeAddress Address="0x6653ff5b" FullMethodName="WKS::gc_heap::allocate_more_space" ModuleName="clr"/>

<CodeAddress Address="0x66475d43" FullMethodName="WKS::gc_heap::allocate_large_object" ModuleName="clr"/>

<CodeAddress Address="0x666d61d8" FullMethodName="WKS::GCHeap::Alloc" ModuleName="clr"/>

<CodeAddress Address="0x666a39a7" ModuleName="clr"/>

<CodeAddress Address="0x664ef3d8" FullMethodName="AllocateArrayEx" ModuleName="clr"/>

<CodeAddress Address="0x664ef464" ModuleName="clr"/>

<CodeAddress Address="0x1f00ee" FullMethodName="ConsoleApplication9.Program.AllocateLOH()"/>

<CodeAddress Address="0x1f008e"/>

<CodeAddress Address="0x663f213b" FullMethodName="CallDescrWorker" ModuleName="clr"/>

<CodeAddress Address="0x66412c2f" FullMethodName="CallDescrWorkerWithHandler" ModuleName="clr"/>

<CodeAddress Address="0x66412dcc" FullMethodName="MethodDesc::CallDescr" ModuleName="clr"/>

<CodeAddress Address="0x66412e01" FullMethodName="MethodDesc::CallTargetWorker" ModuleName="clr"/>

<CodeAddress Address="0x66412e21" FullMethodName="MethodDescCallSite::Call_RetArgSlot" ModuleName="clr"/>

<CodeAddress Address="0x664c8872" FullMethodName="ClassLoader::RunMain" ModuleName="clr"/>

<CodeAddress Address="0x664c8980" FullMethodName="Assembly::ExecuteMainMethod" ModuleName="clr"/>

<CodeAddress Address="0x664c8794" FullMethodName="SystemDomain::ExecuteMainMethod" ModuleName="clr"/>

<CodeAddress Address="0x664ed6f5" FullMethodName="ExecuteEXE" ModuleName="clr"/>

<CodeAddress Address="0x664ed5f7" FullMethodName="_CorExeMainInternal" ModuleName="clr"/>

<CodeAddress Address="0x664e6db8" FullMethodName="_CorExeMain" ModuleName="clr"/>

<CodeAddress Address="0x704561f0" ModuleName="mscoreei"/>

<CodeAddress Address="0x706f7f16" FullMethodName="ShellShim__CorExeMain" ModuleName="mscoree"/>

<CodeAddress Address="0x706f4de3" FullMethodName="_CorExeMain_Exported" ModuleName="mscoree"/>

<CodeAddress Address="0x77bc9d72" FullMethodName="__RtlUserThreadStart" ModuleName="ntdll"/>

<CodeAddress Address="0x77bc9d45" FullMethodName="_RtlUserThreadStart" ModuleName="ntdll"/>

</StackTrace>

</Event

Here is a simple linq statement to extract the above information

 void Main()
 {
 var events = from e in XDocument.Load(@"c:\temp\test.xml").Descendants("Events").Descendants("Event").Take(5)
 where e.Attribute("EventName").Value == "GCAllocationTick" && e.Attribute("AllocationKind").Value == "Large"
 select new {Amount = Dec(e.Attribute("AllocationAmount").Value) ,
 Stack = e.Descendants("StackTrace").Descendants().Attributes("FullMethodName")};

var f = from e in events
 let stack = e.Stack.Select (s => s.Value )
 select new {AllocatedAmount = e.Amount,callstack = stack};
 }
 public static  int Dec(string hex)
 {
 var x1 = hex.Remove(0,3);
 return int.Parse(x1,System.Globalization.NumberStyles.AllowHexSpecifier);
 }
 

And here is the outcome from the linq code

IEnumerable<> (2 items)

AllocatedAmount callstack
340032
IEnumerable<String> (23 items)
ETW::SamplingLog::SendStackTrace
EtwCallout
CoMofTemplate_qqh
WKS::gc_heap::allocate_more_space
WKS::gc_heap::allocate_large_object
WKS::GCHeap::Alloc
AllocateArrayEx
ConsoleApplication9.Program.AllocateLOH()
CallDescrWorker
CallDescrWorkerWithHandler
MethodDesc::CallDescr
MethodDesc::CallTargetWorker
MethodDescCallSite::Call_RetArgSlot
ClassLoader::RunMain
Assembly::ExecuteMainMethod
SystemDomain::ExecuteMainMethod
ExecuteEXE
_CorExeMainInternal
_CorExeMain
ShellShim__CorExeMain
_CorExeMain_Exported
__RtlUserThreadStart
_RtlUserThreadStart
340032
IEnumerable<String> (23 items)
ETW::SamplingLog::SendStackTrace
EtwCallout
CoMofTemplate_qqh
WKS::gc_heap::allocate_more_space
WKS::gc_heap::allocate_large_object
WKS::GCHeap::Alloc
AllocateArrayEx
ConsoleApplication9.Program.AllocateLOH()
CallDescrWorker
CallDescrWorkerWithHandler
MethodDesc::CallDescr
MethodDesc::CallTargetWorker
MethodDescCallSite::Call_RetArgSlot
ClassLoader::RunMain
Assembly::ExecuteMainMethod
SystemDomain::ExecuteMainMethod
ExecuteEXE
_CorExeMainInternal
_CorExeMain
ShellShim__CorExeMain
_CorExeMain_Exported
__RtlUserThreadStart
_RtlUserThreadStart
680064

Wow now we can see every callstack from ntdll to CLR along with our code in callstack. How cool it is to see managed callstack and native in one single block. This wasn’t possible until now. Thanks to the CLR team. Just like I mentioned in my previous blog post we wouldn’t have to use Windbg to get this information. This is only tip of the iceberg , we can do lot of cool things with this.  I am sure Microsoft is working hard to integrate this within XPERF , but I am going to take a stab and see how I can display this data in a much more graphical way in my future posts. So next time someone complains of Performance issue in managed code you can probably ask them to just get on trace file , which will include everything to figure compared to getting  few memory dumps , perfmon counters , webserver log .

Exploring Event Tracing for Windows (ETW) within Task Parallel Library (TPL) using PerfMonitor


Like I mentioned in my previous post , Today I am going to exploring TPL ETW traces. Yes, I know we could use the built in VS.NET profiler, but in production environment we don’t have VS.NET and guess what even VS.NET parallel profiler uses only ETW to get those good looking graphs.

With the today’s release of source code from BCL, I am going to be using PerfMonitor to parse these traces and dump them out in xml file. TPL uses ‘self-manifests’ (The manifest is dumped as event data) we don’t need a man file to interpret it (since it is in the stream).   The TraceEvent code has the ability to decode these events using the manifest in the data stream.

The sample I am going to be using for tracing is MandelbrotFractals_CSharp which is from the PFX Team

I am starting the trace using xperf , on the TPL ETW provider


xperf -start clr -on 2e5dba47-a3d2-4d16-8ee0-6671ffdcd7b5 -f clrevents.etl

Started the application and then stopped the trace using


xperf -stop clr

I had to make few changes to the TraceEvent Code, these are minor . Especially the code didn’t handle Boolean and it was throwing an exception , anybody can make this simple fix.

To get the trace from the ETL to xml here is the command


PerfMonitor.exe /keepETL print clrevents.etl

In the above command we are instructing the exe to convert the etl to xml and not to delete the ETL file after doing it.

The above command output’s an xml file. Here is a sample output from the actual xml file.


<Event MSec= "24181.0409" PID="1635144" PName="(1635144)" TID="1634556" EventName="TaskStarted" ProviderName="TplEtwProvider" OriginatingTaskSchedulerID="1" OriginatingTaskID="0" TaskID="5"/>

<Event MSec= "24181.0734" PID="1635144" PName="(1635144)" TID="1634556" EventName="TaskScheduled" ProviderName="TplEtwProvider" OriginatingTaskSchedulerID="1" OriginatingTaskID="5" TaskID="6" CreatingTaskID="4" TaskCreationOptions="10500"/>

<Event MSec= "24181.0772" PID="1635144" PName="(1635144)" TID="1634556" EventName="ParallelFork" ProviderName="TplEtwProvider" OriginatingTaskManager="1" OriginatingTaskID="5" ForkJoinContextID="1"/>

<Event MSec= "24413.3543" PID="1635144" PName="(1635144)" TID="1634556" EventName="ParallelJoin" ProviderName="TplEtwProvider" OriginatingTaskSchedulerID="1" OriginatingTaskID="5" ForkJoinContextID="1"/>

<Event MSec= "24413.3624" PID="1635144" PName="(1635144)" TID="1634556" EventName="TaskScheduled" ProviderName="TplEtwProvider" OriginatingTaskSchedulerID="1" OriginatingTaskID="5" TaskID="7" CreatingTaskID="4" TaskCreationOptions="10500"/>

<Event MSec= "24413.3791" PID="1635144" PName="(1635144)" TID="1634556" EventName="TaskCompleted" ProviderName="TplEtwProvider" OriginatingTaskSchedulerID="1" OriginatingTaskID="0" TaskID="5" IsExceptional="False"/>

From the above we can see when the task was scheduled, which is the originating task and when did it complete.

This wouldn’t have been possible without ETW, even with Windbg we cannot these information. This can help us understand what is happening with in TPL , which will enable us to design better.

FYI, Perfmonitor is just an implementation of the TraceEvent API which converts the ETW data to xml. It gives us possibilities to extend this.

In the coming posts I will be exploring how ETW can literally replace debugger in Production environment.   Don’t get me wrong, I am very BIG fan of Windbg and have been using it to understand the OS and the CLR.

Thanks to Vance Morrison for letting me know about TraceEvent.

Undocumented Event Tracing for Windows (ETW) providers in .NET 4.0 (BCL)


Continuing with my explorations in ETW I happened notice that there are few undocumented ETW providers within the .NET 4.0 BCL. These are in the mscorlib.dll and they are

  1. System.Collections.Concurrent.CDSCollectionETWBCLProvider : This is used by the ConcurrentBag , ConcurrentDictionary and the ConcurrentStack
  2. System.Diagnostics.Eventing.FrameworkEventSource : This is used by System.Resources.ResourceManager
  3. System.Threading.CdsSyncEtwBCLProvider : This is used by SpinWait, SpinLock and Barrier
  4. System.Threading.TplEtwProvider  & System.Linq.Parallel.PlinqEtwProvider : This is already documented here

To get list of ETW providers on an application ,get pid of the application and then issue this command


logman query providers -pid 1853452

The outcome of the above command is

For example .from above result ,the provider 2e5dba47-a3d2-4d16-8ee0-6671ffdcd7b5 belongs to TplEtwProvider , it is not a registered with a man file. These providers that i have mentioned above are specifically useful in diagnosing parallel applications. In the forth coming posts I will demonstrate how we can leverage ETW in  understanding parallel applications performance.

Track down DLL loading in .NET using Event Tracing for Windows (ETW)


Sometime ago I was doing some performance work for start-up of an .NET application ,and it so happened the CLR Perf team had blogged specifically about this. One of critical pieces to start-up speed is loading only necessary Dll’s. And loading these Dll’s  on demand reduces the memory consumption, reduces look-up ,which could potentially save few hundred  milliseconds based on different configuration. I am going to be using the same  example as the CLR perf team had used. Here is the sample code.


using System.IO;
using System.Xml.Linq;

internal class Program
{
 private static void ProcessArgs(string[] args)
 {
 if (args.Length == 2 && args[0] == "text")
 File.WriteAllText(args[1], "Here is the data I am writing");
 else if (args.Length == 2 && args[0] == "xml")
 {
 var myXml = new XElement("data", "Here is the data I am writing");
 myXml.Save(args[1]);
 }
 }
 private static int Main(string[] args)
 {
 ProcessArgs(args);

 return 0;
 }
}

In this sample code we wouldn’t want to load System.Xml.Linq.dll to be loaded if the command line option was set to “text”. In the past to diagnose the load of System.Xml.Linq.dll I would have to attached the windbg and set a break-point on load of System.Xml.Linq using sxe:ld and then get the call-stack using !clrstack. With the ETW it is as simple starting and stopping a trace.


xperf -start clr -on e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x400010dc:0xffffffff -f clrevents.etl

This is the clr guid for etw “e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x400010dc” and the keyword is 0x400010dc, which is for Fusion,Loader,Jit and Stackwalk combined together. So we are essentially asking clr to log events for Jit,Loader, Fusion and as well as stack-walk information.

After issuing the command start the console app

EtwProvider.exe text text

Stop the trace

xperf -stop clr

And here is the sample output from the etl file

Opcode name MethodNameSpace MethodName MethodSignature
JittingStarted “Program” “ProcessArgs” “void (class System.String[])”
Walk [0x719c9909 : 0x719c9a8d]
AssemblyLoad “System.Xml.Linq; Version=4.0.0.0; Culture=neutral; PublicKeyToken=b77a5c561934e089″ 15
Walk [0x719c9909 : 0x719c9a8d]

The above table represents the ordered trace where the Jit compiler is trying to compile ProcessArgs method ,which is in-turn causing the System.Xml.Linq.dll to be loaded. The reason for doing this is because the Jit compiler is not aware of which code path would be taken and that’s the reason for loading the dll even though command line was text and not xml.

The idea behind this post is not show how the dll loading can be avoided, this has been already demonstrated in the CLR blog. I am trying to demonstrate how we can leverage ETW tracing , which is non-invasive and we don’t have to jump though hoops to get the information.

Using .NET 4.0 Event Tracing for Windows (ETW) along with application ETW


At last CLR 4.0 has started showing some love towards ETW. Like I mentioned before ETW is very useful in understanding what is happening on the entire system, along with your application. Today I am going to be demonstrating a simple application, which is going to allocate large arrays 100 times. Along with the allocation, it is also going to log via ETW each time it allocated. The reason is to see the object allocation within our code and GC collection from CLR in the same trace file. How cool it is to see one trace file, in which our application traces sandwiched along with CLR traces. The requirements to build this trace are Windows SDK 7.0, CLR 4.0 and OS should be Windows Vista or above. I am using Windows 7. Here is an existing blog entry for creating the trace xml file. I am going to reusing the same xml for this sample. This xml file or the “.man” file contains information for the ETW framework to understand. This xml can be hand-crafted or use ecmangen.exe (which is the best) to generate. The ecmangen .exe has an inbuilt help, so I don’t want to repeat the same. The next step in the process is to use MC.exe which is the Message Compiler. FYI all the commands are executed from windows sdk command prompt, which has path set to all the tools.

mc.exe Simple.Manifest.xml

This command generates 3 files as output

1. Simple.Manifest.h

2. Simple.Manifest.rc

3. Simple.ManifestTEMP.BIN

The next command would generate the .cs file for writing the events. To do this we could use mc.exe

mc -css ETW Simple.Manifest.xml

So we are essentially instructing the tool to create a static class with static methods along with the defined namespace as ETW. This would generate a .cs file and update the cs file, with a simple static void Main and here is the code after updating the cs file

//---------------------------------------------------------------------
//
//
//     Generated by Message Compiler (mc.exe)
//
//     Copyright (c) Microsoft Corporation. All Rights Reserved.
//
//     Changes to this file may cause incorrect behavior and will be lost if
//     the code is regenerated.
//
//---------------------------------------------------------------------
namespace ETW
{
using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics;
using System.Diagnostics.Eventing;
using Microsoft.Win32;
using System.Runtime.InteropServices;
using System.Security.Principal;
    public static class SampleProvider
    {
        //
        // Provider SampleProvider Event Count 1
        //
        static void Main(string[] args)
        {
            for (int i = 0; i < 100; i++)
            {
                var obj = new object[85001];
                ETW.SampleProvider.EventWriteSampleEvent("Allocating object" , i);
            }
        }
        private static EventProviderVersionTwo m_provider = new EventProviderVersionTwo(new Guid("d9b453b9-6230-486c-8dec-c7c5a2230d04"));
        //
        // Task :  eventGUIDs
        //
        private static Guid SampleTaskId = new Guid("9b0655be-219f-439e-9439-b604bd822e22");
        //
        // Event Descriptors
        //
        private static EventDescriptor SampleEvent;
        static SampleProvider()
        {
            unchecked
            {
                SampleEvent = new EventDescriptor(0x3e9, 0x1, 0x10, 0x4, 0xa, 0x1, (long)0x8000000000000001);
            }
        }
        //
        // Event method for SampleEvent
        //
        public static bool EventWriteSampleEvent(string StringData, int IntegerData)
        {
            if (!m_provider.IsEnabled())
            {
                return true;
            }
            return m_provider.TemplateSimpleTemplate(ref SampleEvent, StringData, IntegerData);
        }
    }
    internal class EventProviderVersionTwo : EventProvider
    {
         internal EventProviderVersionTwo(Guid id)
                : base(id)
         {}
        [StructLayout(LayoutKind.Explicit, Size = 16)]
        private struct EventData
        {
            [FieldOffset(0)]
            internal UInt64 DataPointer;
            [FieldOffset(8)]
            internal uint Size;
            [FieldOffset(12)]
            internal int Reserved;
        }
        internal unsafe bool TemplateSimpleTemplate(
            ref EventDescriptor eventDescriptor,
            string StringData,
            int IntegerData
            )
       {
           int argumentCount = 2;
            bool status = true;
            if (IsEnabled(eventDescriptor.Level, eventDescriptor.Keywords))
            {
               byte* userData = stackalloc byte[sizeof(EventData) * argumentCount];
                EventData* userDataPtr = (EventData*)userData;
                userDataPtr[0].Size = (uint)(StringData.Length + 1)*sizeof(char);
                userDataPtr[1].DataPointer = (UInt64)(&IntegerData);
                userDataPtr[1].Size = (uint)(sizeof(int)  );
                fixed (char* a0 = StringData)
                {
                    userDataPtr[0].DataPointer = (ulong)a0;
                    status = WriteEvent(ref eventDescriptor, argumentCount, (IntPtr)(userData));
                }
            }
            return status;
        }
    }
}

And also here is the link for downloading the cs file. Within the main method I am allocating large array which should trigger GC and after which I am logging via ETW. The next step is to compile the rc in to res using

RC Simple.Manifest.rc

The next step is to embed the generate resource file into the exe

c:\Windows\Microsoft.NET\Framework64\v4.0.30128\csc.exe /out:EtwProvider.exe /win32res:simple.manifest.res Simple.Manifest.cs /unsafe

Make sure that you use the CLR 4.0 CSC.exe to compile. After compiling update the Simple.Manifest.xml to the location of the exe in the attributes resourceFileName and messageFileName

<provider name="SampleProvider" message="$(string.ProviderMessage)" guid="{D9B453B9-6230-486C-8DEC-C7C5A2230D04}" symbol="SampleProvider" resourceFileName="f:\downloads\EtwProvider.exe" messageFileName="f:\downloads\EtwProvider.exe"

After which we would have to register the xml file with OS.

WEVTUTIL im Simple.Manifest.xml

FYI if the path of the exe within the xml is not accessible or is not a valid then the above command would throw a warning. So now we are done registering and compiling the code , the next step is to subscribe to these events

xperf -start clr -on e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x1CCBD:5  -f clrevents.etl

The above command starts subscribing to clr event traces. I am using the clr etw guid to capture the events. And the events are captured into the clrevents.etl Next we capture events from our application

xperf -start usersession -on "SampleProvider" -f usersession.etl

So we are telling windows to capture events for clr as well as our application. So in the above command we are using the provider name “SampleProvider” instead of guid.

EtwProvider.exe

Then run the application. After running the application , the next step is to stop these traces.

xperf -stop clr
xperf -stop usersession

After stopping we would want to merge the two different traces into one single trace file.

xperf -merge clrevents.etl usersession.etl mergedexample.etl
xperf mergedexample.etl

We should see something like this make sure the sample provider and Microsoft-Windows-DotNETRuntime is checked in the xperf tool

Then zoom in to the section where there are events for sample provider along with CLR events and then right click and choose summary table option.

We should see something like this. Now expand the tree for dotnet as well as the Sample provider. To merge them in to one use the column chooser and uncheck the provider name. It is essentially instructing the xperf not to group by the provider.

Now our custom trace is merged with .NET traces and we can see GC happening when we are allocating objects. This is really cool, so we should be able to figure out what is happening in the runtime when we are doing certain things within the code. Till this day CLR / .NET was always a black box and now it is not any more. I am sure going forward it would only be better. In the forth coming posts I will explain how ETW can help us troubleshoot most of the issues in an non-invasive manner.

Undocumented Silverlight Feature – ETW Tracing


Event Tracing for Windows is tracing mechanism built into windows kernel. It is like log file for windows, but way cool and can log thousands events per second, with very less CPU utilization. From windows vista onwards, we can have windows ETW combined with our application tracing. It would be one trace file which would contain information about windows as well as our application, so it is easy to diagnose and figure what is happening in the OS combined with our application. Today I am not going be discussing about how to do this. I will write about in another post.  This post is going to be about how to have ETW tracing from Silverlight, which is not documented. I am using Silverlight 4.0 and I haven’t checked in another vesion.

I was looking to in to coreclr.dll inside the Windbg. FYI coreclr.dll is the runtime for Silverlight. And I happened to notice ETW. I used the command


x coreclr!etw*

0:026> x coreclr!etw*

6574af13 coreclr!ETW::MethodLog::MethodJitting = <no type information>

657434e5 coreclr!ETW::MethodLog::MethodJitted = <no type information>

6588810e coreclr!ETW::LoaderLog::CollectibleLoaderAllocatorUnload = <no type information>

65886240 coreclr!ETW::MethodLog::SendHelperEvent = <no type information>

6588327f coreclr!ETW::SamplingLog::SaveCurrentStack = <no type information>

6588752d coreclr!ETW::EnumerationLog::StartRundown = <no type information>

65887f30 coreclr!EtwCallback = <no type information>

65886e10 coreclr!ETW::EnumerationLog::IterateAssembly = <no type information>

6578ad0c coreclr!ETW::MethodLog::MethodRestored = <no type information>

65886450 coreclr!ETW::MethodLog::StubInitialized = <no type information>

65886729 coreclr!ETW::LoaderLog::SendModuleEvent = <no type information>

65884d0a coreclr!ETW::EnumerationLog::SendThreadRundownEvent = <no type information>

65884960 coreclr!ETW::LoaderLog::SendDomainEvent = <no type information>

65737ac1 coreclr!ETW::ExceptionLog::ExceptionThrown = <no type information>

658862e6 coreclr!ETW::MethodLog::SendEventsForNgenMethods = <no type information>

658837b0 coreclr!etwStackSupportedEvents = <no type information>

65887a60 coreclr!ETW::EnumerationLog::EndRundown = <no type information>

657a7b57 coreclr!ETW::MethodLog::StubsInitialized = <no type information>

657bb514 coreclr!ETW::CEtwTracer::Register = <no type information>

6588363e coreclr!EtwCallout = <no type information>

656de72b coreclr!ETWTraceStartup::ETWTraceStartup = <no type information>

658865c2 coreclr!ETW::MethodLog::DynamicMethodDestroyed = <no type information>

658853e8 coreclr!ETW::MethodLog::SendMethodEvent = <no type information>

65808659 coreclr!ETW::EnumerationLog::ProcessShutdown = <no type information>

657a67b0 coreclr!ETW::InfoLog::RuntimeInformation = <no type information>

65884e23 coreclr!ETW::LoaderLog::SendAssemblyEvent = <no type information>

65885198 coreclr!ETW::MethodLog::SendMethodJitStartEvent = <no type information>

658872ff coreclr!ETW::EnumerationLog::IterateCollectibleLoaderAllocator = <no type information>

658831d5 coreclr!ETW::SamplingLog::Append = <no type information>

6588840d coreclr!ETW::LoaderLog::ModuleLoad = <no type information>

656de705 coreclr!ETWTraceStartup::StartupTraceEvent = <no type information>

6588744b coreclr!ETW::EnumerationLog::EnumerationHelper = <no type information>

656fa2f8 coreclr!ETW::MethodLog::MethodTableRestored = <no type information>

65886f78 coreclr!ETW::EnumerationLog::IterateModule = <no type information>

656d11a4 coreclr!ETW::SecurityLog::StrongNameVerificationStart = <no type information>

658847f3 coreclr!ETW::GCLog::GCSettingsEvent = <no type information>

So this is cool we can get ETW tracing from Silverlight regarding loader, GC, threading, ThreadPool. But one thing I couldn’t find was a “.man” file like “CLR-ETW.man” for CLR 4.0. This was disappointing. The next thing to do was fired up .NET 4.0, attached to the debugger and issues the command x clr!ETW*.And it was same as the coreclr.dll output above.So the API’s being the same I guessed I could use the clr-etw.man for tracing Silverlight. Then the next thing to do is try tracing Silverlight using CLR-ETW.man. FYI I am using Windows 7 and this will work only from Windows Vista onwards. Make sure the CLR-ETW is registered. To register issue the command (clr-etw.man would be .NET 4.0 directory)


wevtutil im CLR-ETW.man

And to check if the provider is installed issue the command


xperf -providers I | Findstr ".NET"

And the output should be something like this

C:\>xperf -providers I | Findstr “.NET”

a669021c-c450-4609-a035-5af59af4df18     : Microsoft-Windows-DotNETRuntimeRundown

aff081fe-0247-4275-9c4e-021f3dc1da35      : ASP.NET Events

e13c0d23-ccbc-4e12-931b-d9cc2eee27e4    : .NETCommon Language Runtime

e13c0d23-ccbc-4e12-931b-d9cc2eee27e4    : Microsoft-Windows-DotNETRuntime

Try and close all .NET applications like VS2010, so that you don’t get traces from VS2010. From the command line issue the command


xperf -start usersession -on ".NET Common Language Runtime" -f usersession.etl

Start IE and browse a Silverlight site and then stop the trace


xperf -stop usersession

And to view the output of the etl file, issue the command


xperf usersession.etl

There we go now, we are able to see coreclr ETW tracing. From the above picture we can see information like assembly load, module load.  This can be used for GC, Exception, threading etc. So the next time someone tells the Silverlight site is slow on their box, you can ask them to run an etw trace and you would be able to see what is happening on their system.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: