CLR: Under the Hood
Brad Abrams
[email protected]
Program Manager
http://blogs.msdn.com/brada

Health Warning
• This talk dives deep!
• Examines internal data structures
• They will change!
• Internals knowledge not needed to write .NET apps

Quick Refresher: The Unmanaged World
C++ source -> Compiler -> .obj -> Linker -> .exe -> Loader -> Hardware Platform
VB6 source -> Compiler -> PCode/x86 .exe -> VBRun -> Hardware Platform

Quick Refresher: The CLR
Compilation: Source Code -> Language Compiler -> Assembly (Code (IL) + Metadata)
Execution: Assembly -> JIT Compiler -> Native Code (at installation or the first time each method is called)

Agenda
1. Compilation
2. Packaging and Metadata
3. Loading and Layout
4. Execution
5. Runtime Services
Q&A
(Diagram labels: csc.exe /debug helloAtlanta.cs, helloAtlanta.exe, mscorlib.dll, CLR)

1. Compilation
HelloAtlanta.cs

    using System;

    class Output
    {
        private int year;

        public Output(int year)
        {
            this.year = year;
        }

        public void SayHello()
        {
            Console.WriteLine("Hello, it’s the year " + year);
        }
    }

    class HelloAtlanta
    {
        static void Main(string[] args)
        {
            new Output(2004).SayHello();
            new Output(2005).SayHello();
        }
    }

2. Packaging: Assemblies
(Diagram of helloAtlanta.exe. Labels: PE HEADER; Manifest; Assembly; Files used; Exports; Resources; Type (repeated), each with CIL and Metadata; Strings/Blobs)

2. Metadata: IL – Intermediate Language
• IL – The language for execution
• Independent of CPU and platform
  – Created by Microsoft, external commercial and academic language/compiler writers
• Stack based virtual machine: 1+2+3-4

    IL_0001: ldc.i4.1
    IL_0002: ldc.i4.2
    IL_0003: add
    IL_0004: ldc.i4.3
    IL_0005: add
    IL_0006: ldc.i4.4
    IL_0007: sub

Evaluation Stack: (empty)
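The seven-instruction sequence for 1+2+3-4 can be traced with a toy evaluator. This is only a sketch of the evaluation-stack bookkeeping: the CLR does not interpret IL this way (it JIT-compiles it), and the class name `ToyIlStack`, its `Run` method, and the trace format are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy evaluator for the three kinds of opcode used in the 1+2+3-4 example.
// Assumption: this mimics only the evaluation stack; it is not the CLR's mechanism.
public static class ToyIlStack
{
    // Runs the opcodes in order and returns the value left on top of the stack.
    // If trace is non-null, the stack contents (top first) are recorded after each opcode.
    public static int Run(string[] opcodes, List<string> trace)
    {
        Stack<int> stack = new Stack<int>();
        foreach (string op in opcodes)
        {
            if (op.StartsWith("ldc.i4.", StringComparison.Ordinal))
            {
                // ldc.i4.N pushes the constant N (only the numeric short forms are handled).
                stack.Push(int.Parse(op.Substring("ldc.i4.".Length)));
            }
            else if (op == "add")
            {
                int right = stack.Pop();
                int left = stack.Pop();
                stack.Push(left + right);
            }
            else if (op == "sub")
            {
                int right = stack.Pop();
                int left = stack.Pop();
                stack.Push(left - right);
            }
            else
            {
                throw new NotSupportedException("Opcode not handled by this sketch: " + op);
            }

            if (trace != null)
            {
                // Stack<T> enumerates from the top of the stack downwards.
                trace.Add(op + " -> [" + string.Join(", ", stack) + "]");
            }
        }
        return stack.Peek();
    }
}
```

Calling `ToyIlStack.Run` with the opcodes `ldc.i4.1, ldc.i4.2, add, ldc.i4.3, add, ldc.i4.4, sub` leaves 2 on the stack, matching 1+2+3-4=2.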
Metadata: IL – Intermediate Language
• Stack based: 1+2+3–4=2

Evaluation stack after each instruction (top of stack first):

    Instruction           Evaluation Stack
    IL_0001: ldc.i4.1     0000 0001
    IL_0002: ldc.i4.2     0000 0002, 0000 0001
    IL_0003: add          0000 0003
    IL_0004: ldc.i4.3     0000 0003, 0000 0003
    IL_0005: add          0000 0006
    IL_0006: ldc.i4.4     0000 0004, 0000 0006
    IL_0007: sub          0000 0002

2. Metadata: IL – Verification
• When processing IL, the runtime verifies the IL to make sure it’s “safe”
  – Every IL construct is called with the correct number of stack parameters
  – Every method is called with the correct type and number of parameters
• Helps prevent buffer overflows and underruns
• Helps prevent security holes
• Safety allows multiple managed applications in the same process

3. Loading and Layout
Startup logic
• OS hands the image to a CLR shim (mscoree.dll)
• Which starts the runtime
• Runtime performs the following:
  – Loads the assembly into memory and sets up the metadata (MD) reader
  – Resolves immediate dependencies
  – Slurps metadata, creates internal data structures
  – Starts execution at the assembly entry point
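Two of the things the runtime reads from metadata during startup, the immediate dependencies and the entry point, are also visible through reflection. The sketch below only inspects that metadata from managed code; it is an assumption that this is a useful stand-in, since the loader itself works on internal data structures rather than these APIs. `LoaderPeek` and its method names are hypothetical.

```csharp
using System;
using System.Reflection;

// Reads assembly metadata via reflection. Illustration only: the runtime's
// loader does not go through these public APIs.
public static class LoaderPeek
{
    // Names of the assemblies this assembly references directly.
    public static string[] ImmediateDependencies(Assembly assembly)
    {
        AssemblyName[] references = assembly.GetReferencedAssemblies();
        string[] names = new string[references.Length];
        for (int i = 0; i < references.Length; i++)
        {
            names[i] = references[i].Name;
        }
        return names;
    }

    // Describes the method recorded as the entry point, if the assembly has one.
    // Library assemblies have none, so EntryPoint is null for them.
    public static string DescribeEntryPoint(Assembly assembly)
    {
        MethodInfo entry = assembly.EntryPoint;
        if (entry == null)
        {
            return "(no entry point)";
        }
        string owner = entry.DeclaringType == null ? "<module>" : entry.DeclaringType.FullName;
        return owner + "." + entry.Name;
    }
}
```

For an executable such as helloAtlanta.exe, passing `Assembly.GetEntryAssembly()` to both methods would be the natural usage; for a library, `DescribeEntryPoint` reports that there is no entry point.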
Execution: Invocation

IL:

    newobj instance void Output::.ctor(int32)
    call   instance void Output::SayHello()

Objects in GC heap:

    03790118    (MethodTable pointer)
    2004

Private execution engine memory:

    MethodTable (runtime info) at 03790118
    78abed08    System.Object.ToString()
    78a7fe50    System.Object.Equals(System.Object)
    78a74de8    System.Object.GetHashCode()
    78abed58    System.Object.Finalize()
    03690ceb    Output..ctor(Int32)
    03690cfb    Output.SayHello()

Dispatch: [03690cfb] -> Prestub -> JIT Compiler -> SayHello() Native Code (push edi, push esi, push ebx, …)

4. Execution: Invocation – JIT output
Output.SayHello()

    push  edi : push esi
    mov   edi,ecx
    mov   esi,dword ptr ds:[01AF201Ch]
    mov   ecx,788F34E8h
    call  FD34FF55                        // JIT_DbgIsJustMyCode
    mov   edx,eax
    mov   eax,dword ptr [edi+4]
    mov   dword ptr [edx+4],eax
    mov   ecx,esi
    call  7530FA52
    mov   esi,eax
    cmp   dword ptr ds:[01AF1070h],0
    jne   0000000C
    mov   ecx,1
    call  752E8A85                        // System.String.Concat
    mov   ecx,dword ptr ds:[01AF1070h]
    mov   edx,esi
    call  dword ptr ds:[037B0010h]        // System.Console.WriteLine
    pop   esi : pop edi
    ret

4. Execution: JIT Optimizations
• Register Allocation – locals, temps, evaluation stack
• Loop unroll
• Dead code elimination

    #define SOMETHING 0
    if (SOMETHING > 10)
        a = x; // dead code statement

• Constant and Copy propagation
• Processor specific code generation

4. Execution: JIT Optimizations
• Range check elimination

    // Range check will be eliminated
    for (int i = 0; i < myArray.Length; i++)
    {
        Console.WriteLine(myArray[i].ToString());
    }

    // Range check will NOT be eliminated
    for (int i = 0; i < myArray.Length - y; i++)
    {
        Console.WriteLine(myArray[i + x].ToString());
    }

5. Runtime Services: A look at the Garbage Collector
• Reference tracking (tracing) GC
• Large object heap – objects over 80k
• Generational (three gen)
• Mark sweep and compact

More Information
• My Blog: http://blogs.msdn.com/brada
• The SLAR
• Inside MS .NET IL Assembler
• Common Language Infrastructure Annotated Standard
• Shared Source CLI Essentials
• Applied Microsoft .NET Framework Programming
• Compiling for the .NET CLR

Questions?

BACKUP

New runtime features for V2.0
• Generics
• 64 bit (Itanium and x86-64)
• ReflectionOnly context
• Delegate Relaxation
• Lightweight Code Generation
• NGen/NGen services
• Stub based dispatch
• MDbg
• Edit and Continue
• BCL Enhancements

Generational GC
(Diagram labels: Generation 1, Generation 0)
• New Heap Begins with New Generation
• Accessible References Keep Objects Alive
• Preserves / Compacts Referenced Objects
• Objects Left Merge with Older Generation
• New Allocations Rebuild New Generation

Generational GC
(Diagram labels: Generation 2, Generation 0)
• Generations Dynamically Tuned
  – CPU Cache Size
  – Acceptable Fragmentation
• Older Generations are Larger / More Stable
  – Require Collection Less Often
  – Are More Expensive to Collect
  – Can Have References to Newer Objects

What are Generics?
• Type checking, no boxing, no downcasts
• Increased sharing (typed collections)
• Instantiated at run-time, not compile-time
• Work for both reference and value types
• Code shared for reference types
• Exact run-time type information
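Three of the bullets above (no boxing, no downcasts, exact run-time type information) can be seen in a few lines. This is a minimal sketch using the framework's own `List<T>` and `ArrayList`; the class `GenericsDemo` and its method names are hypothetical, and the sketch says nothing about how the runtime shares or specializes the generated code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GenericsDemo
{
    // Typed collection: elements come back as int, so no cast and no unboxing.
    public static int SumTyped(List<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Untyped collection: every int was boxed when added and needs a downcast here.
    public static int SumUntyped(ArrayList values)
    {
        int total = 0;
        foreach (object boxed in values)
        {
            total += (int)boxed;
        }
        return total;
    }

    // Exact run-time type information: each instantiation is a distinct type,
    // and both report the same open generic definition, List<>.
    public static bool InstantiationsAreDistinct()
    {
        Type ofInt = typeof(List<int>);
        Type ofString = typeof(List<string>);
        return ofInt != ofString
            && ofInt.GetGenericTypeDefinition() == typeof(List<>)
            && ofString.GetGenericTypeDefinition() == typeof(List<>);
    }
}
```

Both sum methods give the same answer for the same numbers; the difference is that a non-int element in the `ArrayList` only fails at run time, at the cast.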
Generic Type Declaration:

    public class List<T> { public void Add(T item) { … } }

Generic Method:

    public static void Swap<T>(ref T ref1, ref T ref2) { … }

Type Parameter:

    public class Dictionary<K,V> { }

Type Argument:

    Dictionary<string,int> map = new Dictionary<string,int>();

Constraints:

    public class List<T> where T : IComparable, new()

Open Constructed Type: List<T>
Closed Constructed Type: List<string>

C# Generics

    public class List<T>
    {
        private T[] elements;
        private int count;

        public void Add(T element)
        {
            if (count == elements.Length) Resize(count * 2);
            elements[count++] = element;
        }

        public T this[int index]
        {
            get { return elements[index]; }
            set { elements[index] = value; }
        }

        public int Count
        {
            get { return count; }
        }
    }

    List<int> intList = new List<int>();

    intList.Add(1);         // No boxing
    intList.Add(2);         // No boxing
    intList.Add("Three");   // Compile-time error

    int i = intList[0];     // No cast required

Bits & bytes under the hood
• New metadata tables

    Table                     Number   Columns
    GenericParam              0x2a     Number, Flags, Owner (TypeOrMethodDef), Name (string heap)
    MethodSpec                0x2b     Method (MethodDefOrRef), Instantiation (Blob heap)
    GenericParamConstraint    0x2c     Owner, Constraint (TypeDefOrRef)

• Flags defines variance and special constraints
• GenericParamConstraint defines type constraints

Implementation choices
• Code sharing
  – Modula 3 and ML use code sharing
  – Type identity can be a problem
  – Runtime type checking required
• Code specialization
  – C++ templates use specialization
  – Code bloat can be a problem

4. Execution: JIT Optimizations
• JIT Inlining (method example)

    Class Fib
        Shared Sub NextFib(ByRef i As Integer, ByRef j As Integer)
            Dim k As Integer = i + j
            i = j : j = k
        End Sub

        Shared Sub Main()
            Dim fib1 As Integer = 1 : Dim fib2 As Integer = 1
            Dim n As Integer
            For n = 3 To 36
                NextFib(fib1, fib2)
            Next n
        End Sub
    End Class
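For readers more at home in C#, the VB example can be restated roughly as follows. This is a sketch under two assumptions: `checked` is used to mirror VB's default integer overflow checking (the source of the `jno`/`jo` branches in the listings), and `MethodImplOptions.NoInlining` is used to force the call-per-iteration shape. Whether the JIT actually inlines the unmarked method is its own decision and is not guaranteed. The names `FibCs`, `NextFibNoInline`, and `Run` are hypothetical.

```csharp
using System.Runtime.CompilerServices;

public static class FibCs
{
    // Same body as the VB NextFib; a candidate for inlining by the JIT.
    public static void NextFib(ref int i, ref int j)
    {
        int k = checked(i + j);   // overflow-checked add, as in VB by default
        i = j;
        j = k;
    }

    // Identical logic, but the JIT is told never to inline this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void NextFibNoInline(ref int i, ref int j)
    {
        int k = checked(i + j);
        i = j;
        j = k;
    }

    // Runs the loop from the VB Main (n = 3 To 36) and returns the last value computed.
    public static int Run(bool allowInlining)
    {
        int fib1 = 1;
        int fib2 = 1;
        for (int n = 3; n <= 36; n++)
        {
            if (allowInlining)
            {
                NextFib(ref fib1, ref fib2);
            }
            else
            {
                NextFibNoInline(ref fib1, ref fib2);
            }
        }
        return fib2;
    }
}
```

Both variants compute the same value, the 36th Fibonacci number (14930352); only the generated native code differs.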
Execution: JIT No inline == SLOW

    Prolog
        push ebp : mov ebp,esp
        sub  esp,0Ch : push esi
    Zero vars
        xor  eax,eax
        mov  [ebp-4],eax
        mov  [ebp-8],eax
        xor  esi,esi : nop
    Init vars
        mov  dword ptr [ebp-4],1
        mov  dword ptr [ebp-8],1
        mov  esi,3
    Call NextFib
        lea  ecx,[ebp-4] : lea edx,[ebp-8]
        call [003E50D4h]        ====================> NextFib() method
        nop : nop
    n++
        add  esi,1 : jno 00000009
    Overflow Handler
        xor  ecx,ecx : call 764618ED
    Next n
        cmp  esi,24h : jle FFFFFFE3
    Epilog
        nop : nop : pop esi
        mov  esp,ebp : pop ebp : ret

4. Execution: NextFib() method

    push ebp
    mov  ebp,esp
    sub  esp,0Ch
    push edi
    push esi
    push ebx
    mov  edi,ecx
    mov  esi,edx
    xor  ebx,ebx
    nop
    mov  eax,[edi]
    add  eax,[esi]
    jno  00000009
    xor  ecx,ecx
    call 764618B7
    mov  ebx,eax
    mov  eax,[esi]
    mov  [edi],eax
    mov  [esi],ebx
    nop
    pop  ebx
    pop  esi
    pop  edi
    mov  esp,ebp
    pop  ebp
    ret

4. Execution: JIT Inlining == FAST!

    Prolog
        sub  esp,8
        push esi : push edi
    Zero vars
        xor  eax,eax
        mov  [esp+8],eax
        mov  [esp+C],eax
    Init vars
        mov  [esp+8],1
        mov  [esp+0C],1
        mov  edi,3
    Calc next
        lea  esi,[esp+8] : lea ecx,[esp+C]
        mov  eax,[esi]   : mov edx,[ecx]
        add  eax,edx     : jo  16
        mov  [esi],edx   : mov [ecx],eax
    Next n
        add  edi,1   : jo  0D
        cmp  edi,24h : jle FFFFFFE4
    Epilog
        pop  esi : pop edi
        add  esp,8 : ret