Using sosex within windbg to understand IL and Assembly code


Sometimes when debugging managed code within the debugger I would like to see the C# code, the IL translation of that code, and the assembly code generated for the IL. For example, I recently learned that the callvirt MSIL instruction must null-check the instance before invoking the method. In the JIT-compiled code below, that check is the cmp instruction marked "NULL Check".

C:\Users\naveen\Documents\Visual Studio 2010\Projects\ConsoleApplication13\Program.cs @ 18:
00bc26d8 8b4dec          mov     ecx,dword ptr [ebp-14h]
00bc26db 3909            cmp     dword ptr [ecx],ecx //NULL Check
00bc26dd ff1508a82900    call    dword ptr ds:[29A808h] (System.String.ToLower(), mdToken: 0600031d)
00bc26e3 8945e8          mov     dword ptr [ebp-18h],eax
00bc26e6 8b45e8          mov     eax,dword ptr [ebp-18h]
00bc26e9 8945ec          mov     dword ptr [ebp-14h],eax

I am not an assembly code expert. The above output is from the "!u" sos command. It shows only the source file and line number, not the C# code itself, and it is missing the IL translation.

The "!mu" command from sosex does what I want. It is not yet documented because, as the command's own output says, it is not yet stable. Here is the output for the same code using sosex's !mu.

0:000> !mu
THIS COMMAND IS UNDOCUMENTED AND NOT YET STABLE.
test = test.ToLower();
IL_001a: ldloc.0  (test)
IL_001b: callvirt System.String::ToLower
00bc26d8 8b4dec          mov     ecx,dword ptr [ebp-14h]
00bc26db 3909            cmp     dword ptr [ecx],ecx
00bc26dd ff1508a82900    call    dword ptr ds:[29A808h]
00bc26e3 8945e8          mov     dword ptr [ebp-18h],eax
IL_0020: stloc.0  (test)
00bc26e6 8b45e8          mov     eax,dword ptr [ebp-18h]
00bc26e9 8945ec          mov     dword ptr [ebp-14h],eax

The above output has the C# source, the IL and the assembly interleaved.

Windbg trick – Having custom name for user-defined pseudo-registers


There are 20 user-defined pseudo-registers ($t0, $t1, …, $t19) in windbg/cdb. Scripts that use variable names like @$t0 and @$t1 are hard to read. The trick to avoid this is the "aS" command.

Here is an example. For a loop variable I would like to use a name like "i" instead of "@$t0". This is the command that makes "i" usable as a variable:

aS i "@$t0"

Now "i" is just an alias for "@$t0". Here is another example that uses "i" in a comparison statement:

j (${i} = 0) '.echo is zero' ; '.echo is not zero'

This is the command to remove the alias without evaluating it.

ad ${/v:i}

Decoding clr20r3 .NET exception – using mono cecil


I have often seen devs trying to figure out the cause of an app crash without a memory dump. The only information available to analyze is the Windows Error Reporting message in the event viewer, which has "Event Name: CLR20r3" along with Watson bucket information like this.

Fault bucket , type 0
Event Name: CLR20r3
Response: Not available
Cab Id: 0

Problem signature:
P1: unhandledexception.exe
P2: 1.0.0.0
P3: 4ce1e0f1
P4: LibraryCode
P5: 1.0.0.0
P6: 4ce1e0f1
P7: 7
P8: 1f
P9: System.NullReferenceException
P10:

I will demonstrate the steps for identifying the code that caused the app to crash using only the above information. Here is an explanation of the Watson bucket items:

  1. P1: unhandledexception.exe - the exe file name
  2. P2: 1.0.0.0 - the exe file assembly version number
  3. P3: 4ce1e0f1 - the exe file timestamp
  4. P4: LibraryCode - the faulting assembly name
  5. P5: 1.0.0.0 - the faulting assembly version
  6. P6: 4ce1e0f1 - the faulting assembly timestamp
  7. P7: 7 - the methoddef of the faulting method
  8. P8: 1f - the IL offset within the faulting method
  9. P9: System.NullReferenceException - the exception type that was thrown
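
If you have the event text on hand, the items above can be pulled apart mechanically. The following is only a minimal sketch: the class name WatsonBucket and the method Parse are hypothetical helpers made up for this illustration, and it assumes each item sits on its own line in the "Pn: value" shape shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WatsonBucket
{
    // Collects the "Pn: value" lines of a problem signature into a dictionary.
    public static Dictionary<string, string> Parse(string eventText)
    {
        var items = new Dictionary<string, string>();
        var lines = eventText.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var line in lines)
        {
            var separator = line.IndexOf(':');
            if (separator < 2) continue; // too short to be "Pn:"

            var key = line.Substring(0, separator).Trim();
            // Keep only keys shaped like P<digits>; this skips headers
            // such as "Problem signature:" and "Event Name: CLR20r3".
            if (key.Length < 2 || key[0] != 'P' || !key.Skip(1).All(char.IsDigit)) continue;

            items[key] = line.Substring(separator + 1).Trim();
        }
        return items;
    }

    static void Main()
    {
        var bucket = Parse(
            "Problem signature:\n" +
            "P4: LibraryCode\n" +
            "P7: 7\n" +
            "P8: 1f\n" +
            "P9: System.NullReferenceException\n" +
            "P10:");

        Console.WriteLine("Faulting assembly: " + bucket["P4"]);
        Console.WriteLine("MethodDef (hex):   " + bucket["P7"]);
        Console.WriteLine("IL offset (hex):   " + bucket["P8"]);
        Console.WriteLine("Exception type:    " + bucket["P9"]);
    }
}
```

An empty item such as P10 ends up as an empty string rather than being dropped, so a caller can tell "present but blank" from "missing".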

 

Here is the LibraryCode assembly that is mentioned in P4 of the Watson bucket

using System;

namespace LibraryCode
{
    public class Foo
    {
        public Foo()
        {
            Console.WriteLine("Constructor");
        }
        public void Test()
        {
            Console.WriteLine("Test");
        }
        public string Bar(string test)
        {
            var x = test;
            return x.ToUpper();
        }
        public string Bar1(string test)
        {
            var x = test;
            return x.ToUpper();
        }
        public string Bar2(string test)
        {
            var x = test;
            return x.ToUpper();
        }
        public string Bar3(string test)
        {
            var x = test;
            return x.ToUpper();
        }
        public string Bar4(string test)
        {
            int j = 10;
            for (int i = 0; i < 10; i++)
            {
                j += i;
            }
            var x = test;
            return x.ToUpper();
        }
    }
}


And here is the code for the Main method calling the LibraryCode

  static void Main(string[] args)
        {
            var f = new Foo();
            var x = Console.ReadKey();
            f.Bar4(null);
        }

The most important items in the above Watson bucket are P4, P7, P8 and P9. P4 is the assembly that was responsible for the crash, which is "LibraryCode". P7 is the methoddef of the method that threw the exception, which is "7". To identify the method we have to dump the IL, and here is the command to do that.

ildasm /tokens "C:\temp\LibraryCode.dll" /out=libcode.il

Open libcode.il in a text editor and look for 06000007. A methoddef token starts with 06 (the metadata table) and the remaining digits are the row number in hex. P7 is that row number, 7, and that is how we end up with 06000007. Here is the IL for the corresponding methoddef

.method /*06000007*/ public hidebysig instance string
Bar4(string test) cil managed
{
// Code size       42 (0x2a)

With this we know the method that caused the app to crash.

The next step is to identify the faulting IL code within the method. The IL offset that caused the exception to be thrown is 1f (decimal value is 31), and here is the IL Code

IL_001d:  ldarg.1
IL_001e:  stloc.2
IL_001f:  ldloc.2
IL_0020:  callvirt   instance string [mscorlib/*23000001*/]System.String/*01000013*/::ToUpper() /* 0A000012 */
IL_0025:  stloc.3
IL_0026:  br.s       IL_0028

Now mapping the IL code back to C# shouldn’t be hard.
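
The two conversions used above (P7 to a methoddef token, P8 to an IL label) are simple enough to sketch in code. This is only an illustration: the class name Clr20r3 and both method names are hypothetical, and it assumes P7 and P8 are plain hex strings as in the bucket shown earlier.

```csharp
using System;
using System.Globalization;

static class Clr20r3
{
    // 0x06 in the top byte identifies the MethodDef metadata table.
    const uint MethodDefTable = 0x06000000;

    // P7 is the row number in hex; OR it into the table id to get the token.
    public static uint ToMethodDefToken(string p7)
    {
        uint row = uint.Parse(p7, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
        if (row > 0x00FFFFFF) throw new ArgumentOutOfRangeException("p7", "row number does not fit in 24 bits");
        return MethodDefTable | row;
    }

    // P8 is the IL offset in hex; ildasm labels instructions as IL_xxxx.
    public static string ToIlLabel(string p8)
    {
        int offset = int.Parse(p8, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
        return "IL_" + offset.ToString("x4", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(ToMethodDefToken("7").ToString("X8")); // 06000007
        Console.WriteLine(ToIlLabel("1f"));                      // IL_001f
    }
}
```

The two printed values are the strings to search for in libcode.il.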

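And if you are like me, you would probably want to automate things, so here is the same lookup using Mono Cecil. The Dump() call at the end is the LINQPad extension method, so this snippet is meant to be run in LINQPad.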

AssemblyFactory.GetAssembly(@"C:\Temp\LibraryCode.dll")
		.MainModule.Types.Cast<TypeDefinition>()
		.ElementAt(1)
		.Methods.Cast<MethodDefinition>().First(md => md.MetadataToken.RID == 7)
		.Body.Instructions.Cast<Instruction>()
		.Select (i => 
			new {Offset = i.Offset, 
			OpCode = i.OpCode.ToString() , 
			Operand = i.Operand != null ? i.Operand.ToString() : string.Empty} )
		.Dump();

Notice that the above code looks for methoddef "7", which is the P7 item in the Watson bucket. The code could have dumped only the instruction at IL offset 1f (31), which is "ldloc.2", but that alone would not help. I like to see the entire method to figure out the cause of the exception.

And here is the output from above code.

Note that we cannot get the call stack for the crash from the Watson bucket alone.

Script to !SaveAllModules in .NET 4.0 SOS within Windbg


The .NET 4.0 sos doesn't have a command to save all modules (!SaveAllModules). It only has !SaveModule. Recently I was debugging a .NET 4.0 process for which I had to save all the modules. Here is a script that does what !SaveAllModules did.


!for_each_module .if ($spat ("${@#ImageName}","*.exe")) { !SaveModule ${@#Base} c:\temp\${@#ModuleName}.exe } .else { !SaveModule ${@#Base} c:\temp\${@#ModuleName}.dll }

Using Managed Code to debug Memory Dumps


I happened to notice that the new DebugDiag 1.2 has a COM-based API for dbgeng. The sample code was in VBScript. I am much more comfortable writing managed code than VBScript, so I decided to use the COM-based API from managed code.

Here are a couple of problems that can be solved this way

  1. Parallel GC Roots :- Getting GC roots from a memory dump is the most time-consuming step because SOS is single threaded. I use PFX to get them in parallel.
  2. Reconstructing managed objects :- Creating an instance of an object by reading data from the memory dump.

You need to add a reference to the COM library (DbgHostLib).

In VS2010 (.NET 4.0), COM interop references have Embed Interop Types turned on by default. I couldn't compile the code with this option, so I had to turn off Embed Interop Types.

A few extension methods for DbgObj

static class DbgExtensions {
 public static DbgObj OpenDump(this DbgControlClass dbg, string dumpPath) {
 var path = Environment.GetEnvironmentVariable("_NT_SYMBOL_PATH");
 return dbg.OpenDump(dumpPath, path, path, null);
 }
 public static DbgObj LoadSOS(this DbgObj dbg){
 // By default it only loads psscor2.dll and It will not work for .NET 4.0
 dbg.UnloadExtensions();
 // Will load sos based on the framework version
 var sos = dbg.GetModuleByModuleName("clr") == null ? ".loadby sos mscorwks" : ".loadby sos clr";
 dbg.Execute(sos);
 // Return the debugger object so calls can be chained, e.g. OpenDump(...).LoadSOS()
 return dbg;
 }
 public static IEnumerable<string> DumpHeap(this DbgObj dbg, string typeorMT, bool isMT = false) {
 var parameter = isMT ? "-MT " : "-type ";
 return dbg.Execute("!dumpheap -short " + parameter + typeorMT).Split(new[] { "\n" },
 StringSplitOptions.RemoveEmptyEntries);
 }
 public static string GCRoot(this DbgObj dbg, string address) {
 // Note the trailing space: without it the command and the address run together
 return dbg.Execute("!gcroot " + address);
 }
 public static double ReadDouble(this DbgObj dbg, string address, string offset) {
 return (double)Int32.Parse(
 dbg.Execute(string.Format("dd {0}+{1} L1", address, offset)).Replace("\n", "")
 .Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries)
 .ElementAt(1),
 NumberStyles.AllowHexSpecifier);
 }
 public static string ReadString(this DbgObj dbg, string address, string offset) {
 // The managed string in x86 starts at 8th offset
 return dbg.ReadUnicodeString(ReadDouble(dbg, address, offset) + 8);
 }
 }

Parallel GC Roots

Anybody who has debugged memory dumps for leaks understands the pain of running !gcroot in a loop. AFAIK sos is single threaded. I have had customers with 24-way machines who wanted to use all the CPUs to debug memory leaks, but it wasn't possible.

Here is code that makes parallel GC roots possible

using System;
using System.Collections.Generic;
using System.Linq;
using System.Globalization;
using DbgHostLib;
namespace ConsoleApplication1 {
class Program {
static void Main(string[] args) {
var dump = new DbgControlClass().OpenDump(@"C:\TestClass.dmp").LoadSOS();
var testInstances = dump.DumpHeap("Test.TestClass");
var roots = testInstances.AsParallel().Select(testclass =>
new DbgControlClass().OpenDump(@"C:\TestClass.dmp").LoadSOS().GCRoot(testclass)).ToList();
Console.Read();
}
}
static class DbgExtensions {
public static DbgObj OpenDump(this DbgControlClass dbg, string dumpPath) {
var path = Environment.GetEnvironmentVariable("_NT_SYMBOL_PATH");
return dbg.OpenDump(dumpPath, path, path, null);
}
public static DbgObj LoadSOS(this DbgObj dbg){
// By default it loads psscor2
dbg.UnloadExtensions();
// Will load sos based on the framework version
var sos = dbg.GetModuleByModuleName("clr") == null ? ".loadby sos mscorwks" : ".loadby sos clr";
dbg.Execute(sos);
return dbg;
}
public static IEnumerable<string> DumpHeap(this DbgObj dbg, string typeorMT, bool isMT = false) {
var parameter = isMT ? "-MT " : "-type ";
return dbg.Execute("!dumpheap -short " + parameter + typeorMT).Split(new[] { "\n" },
StringSplitOptions.RemoveEmptyEntries);
}
public static string GCRoot(this DbgObj dbg, string address) {
var s = dbg.IsClrExtensionMissing;
return dbg.Execute("!gcroot " + address);
}
public static double ReadDouble(this DbgObj dbg, string address, string offset) {
return (double)Int32.Parse(
dbg.Execute(string.Format("dd {0}+{1} L1", address, offset)).Replace("\n", "")
.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries)
.ElementAt(1),
NumberStyles.AllowHexSpecifier);
}
public static string ReadString(this DbgObj dbg, string address, string offset) {
// The managed string in x86 starts at 8th offset
return dbg.ReadUnicodeString(ReadDouble(dbg, address, offset) + 8);
}
}
}

The above code loads a memory dump, looks for objects of type "Test.TestClass" and gets their addresses. It then gets the GC roots in parallel using AsParallel, with each work item opening its own copy of the dump.
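
The parallel pattern itself can be tried without a dump or the DebugDiag COM library. The sketch below is only an illustration: FindRoot is a hypothetical stand-in for "open the dump and run !gcroot on one address", and the addresses are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParallelRoots
{
    // Hypothetical stand-in for opening the dump and running !gcroot on one address.
    // In the real code each call creates its own debugger session.
    public static string FindRoot(string address)
    {
        return "roots for " + address;
    }

    // Fans the addresses out across cores. AsOrdered keeps the results
    // in the same order as the input addresses.
    public static List<string> FindRoots(IEnumerable<string> addresses)
    {
        return addresses.AsParallel().AsOrdered().Select(FindRoot).ToList();
    }

    static void Main()
    {
        var addresses = new[] { "00f1c660", "00f1c6a0", "00f1c6e0" }; // made-up addresses
        foreach (var result in FindRoots(addresses))
            Console.WriteLine(result);
    }
}
```

The original code does not call AsOrdered. It is added here only so each result can be matched back to its address by position.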

Reconstructing managed objects

Using the same API it is pretty easy to create an actual instance of a class from a memory dump. Here is the program whose memory I dumped.

using System;
namespace Test {
class Program {
static Foo[] foo= new Foo[5];
static void Main(string[] args) {
for (int i = 0; i < 5; i++)
foo[i] = new Foo() { counter = i, Name = "Name " + i.ToString() };
Console.WriteLine(foo);
Console.Read();
}
}
class Foo {
public int counter;
public string Name;
public override string ToString() {
return string.Format("Counter :- {0} , Name :-  {1} ", counter, Name);
}
}
}

Here is the memory structure of Foo

0:005> !do 00f1c660
Name:        Test.Foo
MethodTable: 009b38bc
EEClass:     009b14a4
Size:        16(0x10) bytes
File:        C:\Foo\bin\Debug\Foo.exe
Fields:
MT                  Field          Offset                    Type  VT     Attr    Value Name
79ba2978  4000002        8         System.Int32  1 instance        2 counter
79b9f9ac  4000003        4        System.String  0 instance 00f1c680 Name

Notice that the field "Name" is at offset 4 and "counter" is at offset 8. I use these offsets to read their contents from the dump. Here is the code that recreates instances of Foo from the memory dump.

class Program {
static void Main(string[] args) {
var dump = new DbgControlClass().OpenDump(@"C:\temp\Foo.dmp").LoadSOS();
var foos = dump.DumpHeap(@"Test.Foo");
foos.Select(s => new Foo() { counter = (int)dump.ReadDouble(s, "0x8"), Name = dump.ReadString(s, "0x4") }).
ToList().ForEach(Console.WriteLine);
Console.Read();
}
}

And here is the output from the above code.

Counter :- 0 , Name :-  Name 0
Counter :- 1 , Name :-  Name 1
Counter :- 2 , Name :-  Name 2
Counter :- 3 , Name :-  Name 3
Counter :- 4 , Name :-  Name 4
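
The ReadDouble helper used above depends on the shape of the dd output: an address column followed by the value. Here is a small self-contained sketch of that parsing step. The class name DdOutput is hypothetical, and the sample line is an assumed dd result for the counter field of the object at 00f1c660 (offset 8, value 2), not captured debugger output.

```csharp
using System;
using System.Globalization;

static class DdOutput
{
    // Parses the first DWORD from text shaped like "<address>  <value>".
    public static uint ParseFirstDword(string ddOutput)
    {
        var tokens = ddOutput.Split(new[] { ' ', '\t', '\r', '\n' },
                                    StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length < 2)
            throw new FormatException("Expected '<address> <value>' but got: " + ddOutput);

        // tokens[0] is the address column, tokens[1] is the value, both in hex.
        // uint is used so values above 7fffffff stay positive.
        return uint.Parse(tokens[1], NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        // Assumed output of: dd 00f1c660+0x8 L1
        Console.WriteLine(ParseFirstDword("00f1c668  00000002\n")); // 2
    }
}
```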

There is a lot more to explore than what I have shown above. Happy debugging  :)

Downloading PDC10 videos using the new async feature


I knew that PDC10 has an OData endpoint, which is http://odata.microsoftpdc.com/ODataSchedule.svc/ . The best part about OData is being able to query for exactly the data we are looking for. Here is my OData url for filtering on the twitter hashtag #languages


http://odata.microsoftpdc.com/ODataSchedule.svc/Sessions()?$filter=startswith(TwitterHashtag,'%23languages')&$expand=DownloadableContent&$select=DownloadableContent

With the above OData feed I could get the urls of the low-bandwidth MP4s to download. Here is the sample code for filtering them out of a saved copy of the feed


var x =XDocument.Load(@"c:\temp\session.xml").Descendants().AsParallel().Where(xd => xd.Name.LocalName=="Url"
&& xd.Value.Contains("_Low.mp4")).Select (xd => xd.Value);
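
To try the filter without the feed, the same query can be run against an inline document. This is only a sketch: the class name FeedFilter, the XML shape and the example.com urls are all made up for illustration, and only the element name "Url" and the "_Low.mp4" suffix come from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class FeedFilter
{
    // Returns the text of every <Url> element (in any namespace)
    // whose value contains "_Low.mp4".
    public static List<string> LowBandwidthUrls(string xml)
    {
        return XDocument.Parse(xml).Descendants()
            .Where(e => e.Name.LocalName == "Url" && e.Value.Contains("_Low.mp4"))
            .Select(e => e.Value)
            .ToList();
    }

    static void Main()
    {
        // Made-up sample; the real feed has a different, larger structure.
        var xml =
            "<feed xmlns:d='http://example.com/d'>" +
            "<d:Url>http://example.com/session1_Low.mp4</d:Url>" +
            "<d:Url>http://example.com/session1_High.mp4</d:Url>" +
            "<d:Title>session1_Low.mp4 notes</d:Title>" +
            "</feed>";

        foreach (var url in LowBandwidthUrls(xml))
            Console.WriteLine(url);
    }
}
```

Matching on LocalName rather than the full name means the query does not need to know the feed's XML namespaces.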

Now that I have the urls, here is the code to download the videos using the new async feature

using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;
using System.Xml.Linq;

namespace Test
{
 class Foo
 {
 static void Main(string[] args)
 {
 DownloadAsync();
 Console.Read();
 }
 static async void DownloadAsync()
 {
 var result = new WebClient().DownloadStringTaskAsync("http://odata.microsoftpdc.com/ODataSchedule.svc/Sessions()?$filter=startswith(TwitterHashtag,'%23languages')&$expand=DownloadableContent&$select=DownloadableContent");
 var downloads = XDocument.Parse(await result).Descendants().AsParallel().
 Where(xd => xd.Name.LocalName == "Url" && xd.Value.Contains("_Low.mp4")).
 Select(xd => new WebClient().DownloadFileTaskAsync(xd.Value, Path.GetFileName(xd.Value)));
 await TaskEx.WhenAll(downloads).ContinueWith(_ => Console.WriteLine("Downloading Complete"));
 }
 }
}

Dumping .NET strings to files using Windbg


In this post I will demonstrate how to dump strings from a memory dump or live process to files. Recently I had to debug a process that had a few big strings whose contents I needed to analyze. The !dumpobj command from sos only dumps part of a long string, and I had a few hundred XML strings to analyze with some automation. Hence the script.

$$ Dumps the managed strings to a file 
$$ Platform x86
$$ Naveen Srinivasan http://naveensrinivasan.com
$$ Usage $$>a<"c:\temp\dumpstringtofolder.txt" 6544f9ac 5000 c:\temp\stringtest 
$$ First argument is the string method table pointer 
$$ Second argument is the min size of the string, used to filter
$$ the strings
$$ Third is the path of the file 
.foreach ($string {!dumpheap -short -mt ${$arg1}  -min ${$arg2}}) 
{ 

  $$ MT        Field      Offset               Type  VT     Attr    Value Name
  $$ 65452978  40000ed        4         System.Int32  1 instance    71117 m_stringLength
  $$ 65451dc8  40000ee        8          System.Char  1 instance       3c m_firstChar
  $$ 6544f9ac  40000ef        8        System.String  0   shared   static Empty
 
  $$ start of string is stored in the 8th offset, which can be inferred from above
  $$ Size of the string which is stored in the 4th offset
  r@$t0=  poi(${$string}+4)*2
  .writemem ${$arg3}${$string}.txt ${$string}+8 ${$string}+8+@$t0
}

To use the above script, copy it to a file and invoke it within Windbg/cdb

$$>a<"c:\temp\dumpstringtofolder.txt" 6544f9ac 5000 c:\temp\stringtest

Parameters to the script

  1. 6544f9ac :- Is the MT to string.
  2. 5000 :- Is the min size of the string that I want to dump
  3. c:\temp\stringtest :- Is the path along with partial filename for each string item

The dumped contents are in Unicode (UTF-16) format. To view them, use something like this


Console.WriteLine(ASCIIEncoding.Unicode.GetString(File.ReadAllBytes(@"c:\temp\stringtest03575270.txt")));
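
The script multiplies m_stringLength by 2 because a .NET string stores two bytes per char. The sketch below checks that arithmetic and the decoding step on an in-memory buffer instead of a dumped file. The class name StringDump and its methods are hypothetical names for this illustration.

```csharp
using System;
using System.Text;

static class StringDump
{
    // Number of bytes the script writes for a string of the given length:
    // m_stringLength counts UTF-16 chars, each of which is 2 bytes.
    public static int ByteCount(int stringLength)
    {
        return stringLength * 2;
    }

    // Decodes bytes written by .writemem back into a string.
    // Encoding.Unicode is UTF-16 little endian, which matches x86 memory layout.
    public static string Decode(byte[] dumped)
    {
        return Encoding.Unicode.GetString(dumped);
    }

    static void Main()
    {
        var original = "<speakers/>";
        // Stand-in for the bytes .writemem would have written for this string.
        var dumped = Encoding.Unicode.GetBytes(original);

        Console.WriteLine(dumped.Length == ByteCount(original.Length)); // True
        Console.WriteLine(Decode(dumped) == original);                  // True
    }
}
```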

And here is sample code that downloads big XML strings, which the above script can then dump to a folder


using System;
using System.Net;
namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            var speakers = new WebClient().DownloadString("http://www.codemash.org/rest/speakers");
            var sessions = new WebClient().DownloadString("http://www.codemash.org/rest/sessions");
            Console.Read();
        }
    }
}