.NET Compact Framework 2.0 Optimizing For Performance

.NET Compact Framework 2.0
Optimizing For Performance
Roman Batoukov
FUN403
Development Lead
.NET Compact Framework
Microsoft Corporation
1
.NET Compact Framework
Visual Studio
FX: Rich class libraries to make your life easy!
GUI: Forms
GUI: Drawing (2D & 3D)
Collections
IO, Networking, Crypto
Native interop
Web services
Data & Xml
Globalization
CLR: Execution Engine provides typesafe runtime for managed code
Type system
Loader
JIT Compiler
Garbage collector
Debugger
Windows CE: Low level operating system-specific functionality
Threads
Memory
Networking
File I/O
2
.NET Compact Framework
How are we different?
Memory constraints
Storage – Flash/ROM
Physical Memory
Virtual Memory – 32MB per process
Design
28% of the surface area in 8% of the size of the full .NET Framework
Portable JIT Compiler
Fast code generation, less optimized
May pitch JIT-compiled code
No NGen, install time or persisted code
Interpreted virtual calls (no v-tables)
Sparse loading of metadata
3
Measuring Performance
Overview
Micro-benchmarks versus Scenarios
Benchmarking tips
Use Environment.TickCount to measure
Measure times greater than 1 second
Start from known state
Ensure nothing else is running
Measure multiple times, take average
Run each test in own AppDomain/Process
Log results at the end
Understand JIT-time versus run-time cost
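The tips above can be combined into a small timing harness. This is a minimal sketch, not code from the talk: the names `MicroBenchmark`, `BenchmarkBody`, `IterationsPerSecond` and `AverageRate` are hypothetical. It assumes the body is cheap enough that one warm-up call is acceptable, and it uses Environment.TickCount as the slide recommends.

```csharp
using System;

// Hypothetical helper types for illustration; not part of the .NET Compact Framework.
public delegate void BenchmarkBody();

public static class MicroBenchmark
{
    // Runs 'body' repeatedly until at least 'minimumMs' milliseconds have
    // elapsed and returns the measured rate in iterations per second.
    public static double IterationsPerSecond(BenchmarkBody body, int minimumMs)
    {
        if (minimumMs < 1)
        {
            throw new ArgumentOutOfRangeException("minimumMs");
        }

        body(); // warm-up call so JIT-compile time is kept out of the measurement

        int iterations = 0;
        int start = Environment.TickCount;
        int elapsed;
        do
        {
            body();
            iterations++;
            // unchecked subtraction stays correct if TickCount wraps around
            elapsed = unchecked(Environment.TickCount - start);
        } while (elapsed < minimumMs);

        return iterations * 1000.0 / elapsed;
    }

    // Repeats the measurement 'runs' times and returns the average rate.
    public static double AverageRate(BenchmarkBody body, int minimumMs, int runs)
    {
        double sum = 0;
        for (int i = 0; i < runs; i++)
        {
            sum += IterationsPerSecond(body, minimumMs);
        }
        return sum / runs;
    }
}
```

A call such as `MicroBenchmark.AverageRate(delegate { DoWork(); }, 1000, 5)` (with `DoWork` standing in for the code under test) measures for at least one second per run and averages five runs. Results should still be logged only after the measurement finishes, as the list above suggests.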
4
.NET Compact Framework
.NET Compact Framework Performance v1->v2 (Pocket PC 2003, XScale 400MHz)
Version columns: 1.0, 1.0 SP3, V2 Beta1, V2 current (rows with three values leave one column blank)
Bigger is better
Method Calls (Calls/sec): 3.7M, 7.1M, 8.1M
Virtual Calls (Calls/sec): 2.4M, 2.7M, 5.6M
Simple P/Invoke (Calls/sec): 733K, 1.7M, 1.8M
Primes (to 1500) (iterations/sec): 562, 832, 855
GC Small (8 bytes) (Bytes/sec): 1M, 7M, 7.5M
GC Array (100 ints) (Bytes/sec): 25M, 43M, 115M
Smaller is better
XML Text Reader 200KB (seconds): 1.7, 1.2, 0.72, 0.69
DataSet (static data), 4 tables, 1000 records (seconds): 13.1, 6.6, 7.3, 4.0
DataSet (ReadXml), 3 tables, 100 records (seconds): 12.3, 6.5, 5.2, 3.9
5
Measuring Performance
Performance counters
<My App>.stat (formerly mscoree.stat)
http://msdn.microsoft.com/library/enus/dnnetcomp/html/netcfperf.asp
Registry
HKLM\SOFTWARE\Microsoft\.NETCompactFramework\PerfMonitor
Counters (DWORD) = 1
What does .stat tell you?
Working set and performance statistics
More counters added in v2
Generics usage
COM interop usage
Number of boxed valuetypes
Threading and timers
GUI objects
Network activity (socket bytes sent/received)
6
.stat
counter
Total Program Run Time (ms)
App Domains Created
App Domains Unloaded
Assemblies Loaded
Classes Loaded
Methods Loaded
Closed Types Loaded
Closed Types Loaded per Definition
Open Types Loaded
Closed Methods Loaded
Closed Methods Loaded per Definition
Open Methods Loaded
Threads in Thread Pool
Pending Timers
Scheduled Timers
Timers Delayed by Thread Pool Limit
Work Items Queued
Uncontested Monitor.Enter Calls
Contested Monitor.Enter Calls
Peak Bytes Allocated (native + managed)
Managed Objects Allocated
Managed Bytes Allocated
Managed String Objects Allocated
Bytes of String Objects Allocated
Garbage Collections (GC)
Bytes Collected By GC
Managed Bytes In Use After GC
Total Bytes In Use After GC
GC Compactions
Code Pitchings
Calls to GC.Collect
GC Latency Time (ms)
Pinned Objects
Objects Moved by Compactor
Objects Not Moved by Compactor
Objects Finalized
Boxed Value Types
Process Heap
Short Term Heap
JIT Heap
App Domain Heap
GC Heap
Native Bytes Jitted
Methods Jitted
Bytes Pitched
Columns reported for each counter: total, last datum, n, mean, min, max
Counters highlighted on the slide:
Peak Bytes Allocated (native + managed)
JIT Heap
App Domain Heap
GC Heap
Garbage Collections (GC)
GC Latency Time (ms)
Boxed Value Types
Managed String Objects Allocated
7
.NET Compact Framework
[Architecture diagram]
Redist: MSI Setup (ActiveSync); Per Device CAB Install (SMS, etc)
FX areas: Globalization, Crypto, I/O, Net, GUI
FX assemblies and namespaces: System.Globalization, System.Cryptography, System.IO.Ports, System.WebServices, DirectX.DirectD3DM, Microsoft.Win32.Registry, System.Net.Http*, Windows.Forms, System.IO.File, System.Net.Sockets, System.Drawing, Microsoft.VisualBasic, System.Reflection, System, System.Data, mscorlib, System.Xml
Native services: File I/O, NTLM, Common Controls, Registry, SSL, GDI/GWES, Sockets, D3DM
CLR: Debugger, JIT Compiler & GC, Calendar Data, Class Loader, Assembly Cache, Culture Data, App Domain Loader, Native Interop, Process Loader, Memory and Threading
Debugging: Visual Studio Debug Engine, ICorDbg
Host and platform boxes: Host, Sorting, Crypto API, Managed Loader, Cert/Security, File Mapping, Verification, Encodings, Casing, Windows CE
8
Common Language Runtime
Execution engine
Call path
Managed calls are more expensive than native
Instance call: ~2-3X the cost of a native function call
Virtual call: ~1.4X the cost of a managed instance call
Platform invoke: ~5X the cost of a managed instance call (*Marshal int parameter)
Properties are calls
JIT compilers
All platforms have the same optimizing JIT compiler architecture in v2
Optimizations
Method inlining for simple methods
Variable enregistration
String interning
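Because properties are calls, a property read inside a hot loop can cost one call per iteration when the getter is virtual or otherwise not inlined. The sketch below illustrates the idea with hypothetical names (`Playlist`, `PropertyCallSketch`); it assumes the property value cannot change while the loop runs, which is what makes caching it in a local safe.

```csharp
// Hypothetical types for illustration only.
public class Playlist
{
    private int m_count;

    public Playlist(int count)
    {
        m_count = count;
    }

    // A virtual property: every read goes through a virtual call.
    public virtual int Count
    {
        get { return m_count; }
    }
}

public static class PropertyCallSketch
{
    // Reads playlist.Count in the loop test: one property call per iteration.
    public static int SumIndexesCallingProperty(Playlist playlist)
    {
        int sum = 0;
        for (int i = 0; i < playlist.Count; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Reads the property once and reuses the local value in the loop test.
    public static int SumIndexesCachingProperty(Playlist playlist)
    {
        int count = playlist.Count;
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += i;
        }
        return sum;
    }
}
```

Both methods return the same result; whether the cached version is measurably faster depends on the getter and the JIT, so it is worth benchmarking before restructuring code.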
9
Common Language Runtime
Call path (sample)
// Version 1: Volume is a virtual property
public class Shape
{
    protected int m_volume;
    public virtual int Volume
    {
        get { return m_volume; }
    }
}
public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}
// Version 2: Volume is a non-virtual property
public class Shape
{
    protected int m_volume;
    public int Volume
    {
        get { return m_volume; }
    }
}
public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}
10
Common Language Runtime
Call path (sample)
public class MyCollection
{
    private const int m_capacity = 10000;
    private Shape[] storage = new Shape[m_capacity];
    …
    public void Sort()
    {
        // Each read of .Volume below compiles to:
        //   callvirt instance int32 Shape::get_Volume()
        Shape tmp;
        for (int i = 0; i < m_capacity - 1; i++) {
            for (int j = 0; j < m_capacity - 1 - i; j++)
                if (storage[j + 1].Volume < storage[j].Volume) {
                    tmp = storage[j];
                    storage[j] = storage[j + 1];
                    storage[j + 1] = tmp;
                }
        }
    }
}
11
Common Language Runtime
Call path (sample)
Virtual property: 57 sec
public class Shape
{
    protected int m_volume;
    public virtual int Volume
    {
        get { return m_volume; }
    }
}
public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}
Non-virtual property: ~39 sec
public class Shape
{
    protected int m_volume;
    public int Volume
    {
        get { return m_volume; }
    }
}
public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}
No virtual call overhead
Inlined (no call overhead at all)
Equal to accessing field
12
Common Language Runtime
‘The Memory Bill’
Shared by all .NET applications running
.NET Compact Framework CLR DLLs
.NET assemblies (memory mapped)
Dynamic, per process memory costs
Objects allocated
Thread stacks
Number of classes and methods
Runtime representation of metadata
JIT compiled code
Unmanaged allocations (not under control of the CLR)
Operating System
Native DLLs called by application via P/Invoke
13
Common Language Runtime
Memory heaps
Five memory heaps to reduce fragmentation
App-domain: CLR dynamic representation of metadata for the assembly loader
JIT: JIT compiled code buffers
Garbage Collector (GC): Application and Framework object allocations
Short-term: CLR temporary/short lived allocation heap
Process: Other CLR allocations
14
Going Into The Background
Yahtzee game
[Chart: Memory (MB), 0 to 2, versus Time (seconds), 0 to 150. Labels: GC Heap, JIT Heap, Managed Bytes Allocated, GC Heap Usage. Annotation: Application goes into background or low on memory]
15
Real World Measurements
Yahtzee game
Memory by location (Peak / ‘On Minimize’)
Shared, RO, demand paged
Code (Mscoree.dll, mscoree2_0.dll, netcfagl2_0.dll): 1MB / 500KB
.NET Assemblies (Mscorlib, Yahtzee.exe, system, System.Drawing, System.Windows.Forms): 1.7MB / 1MB
Process memory: CLR heaps
JIT Heap: 220KB / 30KB
Process Heap: 47KB / 11KB
App Domain Heap: 177KB / 177KB
GC Heap – object allocations: 1MB / 64KB
Total: 1.5MB / 282KB
Application total: 4.2MB / 1.7MB
16
Common Language Runtime
Garbage Collector (GC)
Managed allocations are FAST
7.5MB per sec (allocating 8 byte objects)
GC manages its own heap
Allocates 64KB blocks, 1MB cache
Uses VirtualAlloc to enable release of virtual and physical memory back to the system
Compacts heap when fragmentation occurs
17
Common Language Runtime
Garbage Collector
What triggers a GC?
Memory allocation failure
1M of GC objects allocated (v2)
Application going to background
GC.Collect() (Avoid “helping” the GC!)
In general, if you don’t allocate objects, GC won’t occur
Beware of side-effects of calls that may allocate objects
What happens at GC time?
Freezes all threads at safe point
Finds all live objects and marks them
An object is live if it is reachable from a root location
Unmarked objects are freed, except that those with finalizers are first added to the finalizer queue
Finalizers are run on a separate thread
GC pools are compacted if required
Return free memory to the operating system
18
Common Language Runtime
Garbage Collector
GC Latency per collection
[Chart: GC latency (ms), 0 to 90, versus Number of Live Objects, 0 to 500000]
19
Common Language Runtime
Garbage Collector
Allocation rate
[Chart: Allocation rate (iter/sec), 0 to 160000, versus Object size (bytes), 400 to 80000]
20
Common Language Runtime
Where does garbage come from?
Unnecessary string allocations
Strings are immutable
String manipulations (Concat(), etc.) cause copies
Use StringBuilder
http://weblogs.asp.net/ricom/archive/2003/12/02/40778.aspx
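// String concatenation: every += allocates a new string and copies the old one
String result = "";
for (int i = 0; i < 10000; i++) {
    result += ".NET Compact Framework";
    result += " Rocks!";
}
// StringBuilder: appends into a growing buffer
StringBuilder result = new StringBuilder();
for (int i = 0; i < 10000; i++) {
    result.Append(".NET Compact Framework");
    result.Append(" Rocks!");
}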
21
.stat
counter
Total Program Run Time (ms)
App Domains Created
App Domains Unloaded
Assemblies Loaded
Classes Loaded
Methods Loaded
Closed Types Loaded
Closed Types Loaded per Definition
Open Types Loaded
Closed Methods Loaded
Closed Methods Loaded per Definition
Open Methods Loaded
Threads in Thread Pool
Pending Timers
Scheduled Timers
Timers Delayed by Thread Pool Limit
Work Items Queued
Uncontested Monitor.Enter Calls
Contested Monitor.Enter Calls
Peak Bytes Allocated (native + managed)
Managed Objects Allocated
Managed Bytes Allocated
Managed String Objects Allocated
Bytes of String Objects Allocated
Garbage Collections (GC)
Bytes Collected By GC
Managed Bytes In Use After GC
Total Bytes In Use After GC
GC Compactions
Code Pitchings
Calls to GC.Collect
GC Latency Time (ms)
Pinned Objects
Objects Moved by Compactor
Objects Not Moved by Compactor
Objects Finalized
Boxed Value Types
Process Heap
Short Term Heap
JIT Heap
App Domain Heap
GC Heap
Native Bytes Jitted
Methods Jitted
Bytes Pitched
Methods Pitched
Method Pitch Latency Time (ms)
Exceptions Thrown
Platform Invoke Calls
Columns reported for each counter: total, last datum, n, mean, min, max
Sample run:
String result = "";
for (int i = 0; i < 10000; i++) {
    result += ".NET Compact Framework";
    result += " Rocks!";
}
Run time: 173 sec
Managed String Objects Allocated: 20040
Garbage Collections (GC): 4912
Bytes of String Objects Allocated: 5,800,480,574
Bytes Collected By GC: 5,918,699,036
GC Latency Time: 107128 ms
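The size of the Bytes of String Objects Allocated figure follows from the arithmetic of the loop: each += allocates a whole new string, so the characters copied grow with the square of the iteration count. The sketch below reproduces that estimate without allocating any strings. The names `ConcatCost`, `CharsAllocated` and `BytesAllocated` are hypothetical, and the estimate assumes 2 bytes per character and ignores per-object overhead.

```csharp
// Hypothetical helper for illustration; it only models string lengths.
public static class ConcatCost
{
    // Total characters written into newly allocated strings when a loop
    // appends two fragments per iteration using +=.
    public static long CharsAllocated(int iterations, int firstLength, int secondLength)
    {
        long total = 0;
        long length = 0; // current length of the accumulated string
        for (int i = 0; i < iterations; i++)
        {
            length += firstLength;  // first += allocates a string of this length
            total += length;
            length += secondLength; // second += allocates another, longer string
            total += length;
        }
        return total;
    }

    // Estimated bytes for the loop on this slide, assuming 2 bytes per character.
    public static long BytesAllocated(int iterations)
    {
        return 2 * CharsAllocated(
            iterations,
            ".NET Compact Framework".Length,
            " Rocks!".Length);
    }
}
```

For 10000 iterations this estimate comes to 5,800,440,000 bytes, which is in line with the counter value of roughly 5.8 billion bytes reported for the run.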
22
.stat
StringBuilder result = new StringBuilder();
for (int i = 0; i < 10000; i++) {
    result.Append(".NET Compact Framework");
    result.Append(" Rocks!");
}
Run time: 0.1 sec
Managed String Objects Allocated: 56
Bytes of String Objects Allocated: 2097718
Garbage Collections (GC): 2
Bytes Collected By GC: 1081620
GC Latency Time: 21 ms
23
Common Language Runtime
Where does garbage come from?
Unnecessary boxing
Value types are allocated on the stack (fast to allocate)
Boxing causes a heap allocation and a copy
Use strongly typed arrays and collections (non-generic Framework collections are NOT strongly typed)
class Hashtable {
    struct bucket {
        Object key;
        Object val;
    }
    bucket[] buckets;
    public Object this[Object key] { get; set; }
}
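To make the boxing cost concrete, the sketch below stores the same integers in an untyped ArrayList and in a strongly typed List<int>. The class and method names (`BoxingSketch`, `SumUntyped`, `SumTyped`) are hypothetical; the behavior shown (boxing on Add into an Object-based collection, unboxing on the cast back) is standard CLR behavior.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class BoxingSketch
{
    // ArrayList stores Object references, so each int is boxed on Add
    // (one heap allocation per element) and unboxed by the cast on read.
    public static int SumUntyped(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i); // boxes i
        }
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += (int)list[i]; // unboxes
        }
        return sum;
    }

    // List<int> stores the ints directly: no boxing and no casts.
    public static int SumTyped(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += list[i];
        }
        return sum;
    }
}
```

Running both under the performance counters should show the difference in the Boxed Value Types counter, which is the quickest way to confirm where boxing happens in a real application.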
24
Common Language Runtime
Sample Code: Value Types and boxing
public struct AccountId {
    public int m_number;
    public override int GetHashCode() { return m_number; }
}
public struct AccountData {
    private int m_balance;
    public int Balance {
        get { return m_balance; }
        set { m_balance = value; }
    }
}
// Untyped version: ids and records pass through Object, so they are boxed
public class Accounts {
    public const int num_accounts = 10000;
    Object[] accounts = new Object[num_accounts];
    public Object this[Object id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}
// Strongly typed version: no boxing
public class Accounts {
    public const int num_accounts = 10000;
    AccountData[] accounts = new AccountData[num_accounts];
    public AccountData this[AccountId id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}
25
Common Language Runtime
Sample Code: Value Types and boxing
Accounts ac = new Accounts();
int i;
for (i = 0; i < Accounts.num_accounts; i++) {
    AccountData rec = new AccountData();
    rec.Balance = 100;
    AccountId id;
    id.m_number = i;
    ac[id] = rec;
}
long iterations = 0;
long start = Environment.TickCount;
do {
    for (i = 0; i < Accounts.num_accounts; i++) {
        AccountId id;
        id.m_number = i;
        AccountData rec = (AccountData)ac[id];
        rec.Balance -= 10;
        ac[id] = rec;
    }
    iterations += i;
} while (Environment.TickCount - start < 1000);
26
Common Language Runtime
Sample Code: Value Types and boxing
Untyped: 0.15M iter/sec
Boxed value types: 4138460
Garbage Collections (GC): 4
Bytes Collected By GC: 4138460
GC Latency Time: 132 ms
public class Accounts {
    public const int num_accounts = 10000;
    Object[] accounts = new Object[num_accounts];
    public Object this[Object id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}
Strongly typed: 2.5M iter/sec
Boxed value types: 2
Garbage Collections (GC): 0
Bytes Collected By GC: 0
GC Latency Time: 0 ms
public class Accounts {
    public const int num_accounts = 10000;
    AccountData[] accounts = new AccountData[num_accounts];
    public AccountData this[AccountId id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}
27
Common Language Runtime
Sample Code: Generics
public class Accounts<U, V>
{
    public const int num_accounts = 10000;
    private U[] accounts = new U[num_accounts];
    public U this[V id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}
Accounts<AccountData, AccountId> ac = new Accounts<AccountData, AccountId>();
int i;
for (i = 0; i < Accounts<AccountData, AccountId>.num_accounts; i++) {
    AccountData rec = new AccountData();
    rec.Balance = 100;
    AccountId id;
    id.m_number = i;
    ac[id] = rec;
}
long iterations = 0;
long start = Environment.TickCount;
do {
    for (i = 0; i < Accounts<AccountData, AccountId>.num_accounts; i++) {
        AccountId id;
        id.m_number = i;
        AccountData rec = (AccountData)ac[id];
        rec.Balance -= 10;
        ac[id] = rec;
    }
    iterations += i;
} while (Environment.TickCount - start < 1000);
28
Common Language Runtime
Sample Code: Generics
Results for the Accounts sample:
Untyped: 0.15M iter/sec
Strongly typed: 2.5M iter/sec
Generic: 2.5M iter/sec
29
.stat
Boxed value types: 2
Garbage Collections (GC): 0
Bytes Collected By GC: 0
GC Latency Time: 0 ms
Closed Types Loaded: 1
Closed Types Loaded per Definition: mean=1, max=1
30
Common Language Runtime
Generics
Strong typing without code duplication
Fully specialized implementation in .NET Compact Framework v2
Pros
Always strongly typed
No unnecessary boxing and type casts
Specialized code is more efficient than shared
Cons
Internal execution engine data structures and JIT-compiled code aren’t shared
List<int>, List<string>, List<MyType>
http://blogs.msdn.com/romanbat/archive/2005/01/06/348114.aspx
31
Common Language Runtime
Finalization and Dispose
Cost of finalizers
Non-deterministic cleanup
Extends lifetime of object
In general, rely on GC for automatic memory cleanup
The exceptions to the rule…
If your object contains an unmanaged resource that the GC is unaware of, you need to implement a finalizer
Also implement the Dispose pattern to release the unmanaged resource in a deterministic manner
The Dispose method should suppress finalization (FxCop rule)
If the object you are using implements Dispose, call it when you are done with the object
‘Objects Finalized’ performance counter
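The finalizer and Dispose guidance above is usually implemented together. The following is a minimal sketch of that pattern, assuming a hypothetical `NativeBuffer` class that owns a block of unmanaged memory obtained with Marshal.AllocHGlobal; the class name and the `IsDisposed` property are illustrative only.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around an unmanaged allocation the GC cannot see.
public class NativeBuffer : IDisposable
{
    private IntPtr m_buffer;
    private bool m_disposed;

    public NativeBuffer(int size)
    {
        m_buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return m_disposed; }
    }

    // Deterministic cleanup: callers invoke this when done with the object.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer no longer needs to run
    }

    // 'disposing' is true when called from Dispose() and false when called
    // from the finalizer; only the unmanaged block is released here.
    protected virtual void Dispose(bool disposing)
    {
        if (m_disposed)
        {
            return; // safe to call more than once
        }
        Marshal.FreeHGlobal(m_buffer);
        m_buffer = IntPtr.Zero;
        m_disposed = true;
    }

    // Safety net in case Dispose() was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }
}
```

Callers would typically wrap the object in a using statement, for example `using (NativeBuffer buffer = new NativeBuffer(1024)) { ... }`, so that Dispose runs even if an exception is thrown and the object never reaches the finalizer queue.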
32
Common Language Runtime
Exceptions
Exceptions are cheap…until you throw
Throw exceptions in exceptional circumstances
Do not use exceptions for normal flow control
Use performance counters to track the number of exceptions thrown
Replace “On Error/Goto” with “Try/Catch/Finally” in Microsoft Visual Basic .NET
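As an illustration of keeping exceptions out of normal flow control, the sketch below counts dictionary hits in two ways: by catching KeyNotFoundException on every miss, and by asking with TryGetValue. The names `LookupSketch`, `CountHitsWithExceptions` and `CountHitsWithTryGetValue` are hypothetical; the two methods return the same count, but the first throws once per missing key.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class LookupSketch
{
    // Uses exceptions for control flow: each missing key throws and catches.
    public static int CountHitsWithExceptions(Dictionary<string, int> table, string[] keys)
    {
        int hits = 0;
        foreach (string key in keys)
        {
            try
            {
                int value = table[key]; // throws KeyNotFoundException on a miss
                hits++;
            }
            catch (KeyNotFoundException)
            {
                // a miss is an expected case here, so this is the slow path
            }
        }
        return hits;
    }

    // Asks first: a miss is reported through the return value, nothing is thrown.
    public static int CountHitsWithTryGetValue(Dictionary<string, int> table, string[] keys)
    {
        int hits = 0;
        foreach (string key in keys)
        {
            int value;
            if (table.TryGetValue(key, out value))
            {
                hits++;
            }
        }
        return hits;
    }
}
```

When misses are common, the Exceptions Thrown counter makes the difference between the two approaches visible without a profiler.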
33
Common Language Runtime
Reflection
Reflection can be expensive
Reflection performance cost
Type comparisons (for example: typeof() )
Member access (for example: Type.InvokeMember())
Think ~10-100x slower
Working set cost
Type and Member enumerations (for example: Assembly.GetTypes(), Type.GetMethods())
Runtime data structures
Think ~100 bytes per loaded type, ~80 bytes per loaded method
Be aware of APIs that use reflection as a side effect
Override
Object.ToString()
GetHashCode() and Equals() (for value types)
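The overrides listed above can be sketched for a small value type as follows. `GridPoint` is a hypothetical struct used only for illustration; the point is that field-by-field implementations replace the inherited ones, which the slide identifies as using reflection as a side effect. The hash formula is one simple choice, not a recommendation from the talk.

```csharp
using System;

// Hypothetical value type for illustration only.
public struct GridPoint : IEquatable<GridPoint>
{
    public readonly int X;
    public readonly int Y;

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(GridPoint other)
    {
        return X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return obj is GridPoint && Equals((GridPoint)obj);
    }

    // Combines both fields directly.
    public override int GetHashCode()
    {
        return (X * 397) ^ Y;
    }

    public override string ToString()
    {
        return "(" + X.ToString() + ", " + Y.ToString() + ")";
    }
}
```

With these overrides in place, using the struct as a dictionary key or comparing two instances works on the fields directly.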
34
Common Language Runtime
Building a Cost Model for Managed Math
Math performance
32 bit integers: Similar to native math
64 bit integers: ~5-10X cost of native math
Floating point: Similar to native math
Target ARM processors typically do not have an FPU, so floating point is emulated in software for both managed and native code
35
Base Class Library
Collections
Pre-size collection classes appropriately
Default capacity is small (for example 4 for ArrayList)
Resizing creates unnecessary copies
Avoid unnecessary boxing and type casts: use generic collections
Full support for all generic collections in the .NET Compact Framework v2!
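Pre-sizing is a one-argument change when the element count is known in advance. The sketch below contrasts the two constructions; `PresizeSketch` and its method names are hypothetical, while the `List<T>(int capacity)` constructor and the `Capacity` property are standard.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class PresizeSketch
{
    // Capacity is reserved up front, so adding 'count' items never
    // forces the list to grow and copy its backing array.
    public static List<int> BuildPresized(int count)
    {
        List<int> items = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            items.Add(i);
        }
        return items;
    }

    // Starts small and grows as items are added; each growth step
    // allocates a larger array and copies the existing elements.
    public static List<int> BuildDefault(int count)
    {
        List<int> items = new List<int>();
        for (int i = 0; i < count; i++)
        {
            items.Add(i);
        }
        return items;
    }
}
```

Both lists end up with the same contents; the difference shows up in the Managed Bytes Allocated and Garbage Collections counters when the collections are large or built frequently.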
37
Windows Forms
Best Practices
Load and cache Forms in the background
Populate data separate from Form.Show()
Pre-populate data, or
Load data async to Form.Show()
Use BeginUpdate/EndUpdate when it is available
e.g. ListView, TreeView
Use SuspendLayout/ResumeLayout when repositioning controls
Keep event handling code tight
Process bigger operations asynchronously
Blocking in event handlers will affect UI responsiveness
Form load performance
Reduce the number of method calls during initialization
38
Graphics And Games
Best Practices
Compose to off-screen buffers to minimize direct-to-screen blitting
Approximately 50% faster
Avoid transparent blitting in areas that require performance
Approximately 1/3 the speed of normal blitting
Consider using pre-rendered images instead of System.Drawing rendering primitives
Need to measure on a case-by-case basis
39
XML
Best Practices for Managing Large XML Data Files
Use XmlTextReader/XmlTextWriter
Smaller memory footprint than using XmlDocument
XmlTextReader is a pull model parser which only reads a “window” of the data
XmlDocument builds a generic, untyped object model using a tree
Type stored as string
OK to use with smaller documents (64K XML: ~0.25s)
Optimize the structure of the XML document
Use elements to group (allows use of Skip() in XmlReader)
Use attributes to reduce size; processing attribute-centric documents is faster
Keep it short! (attribute and element names)
Avoid gratuitous use of white space
Use XmlReader/XmlWriter factory classes to create an optimized reader or writer
Applying proper XmlReaderSettings can improve performance
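The factory, settings and Skip() advice can be combined in a few lines. The sketch below counts order elements while skipping every history subtree without parsing its contents. The element names (`order`, `history`) and the class `XmlSkipSketch` are hypothetical; XmlReader.Create, XmlReaderSettings and Skip() are the standard System.Xml APIs named on the slide.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper for illustration only.
public static class XmlSkipSketch
{
    // Counts <order> elements, ignoring anything nested inside <history>.
    public static int CountOrders(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true; // do not surface whitespace nodes
        settings.IgnoreComments = true;   // do not surface comment nodes

        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "history")
                {
                    // Skip() moves past the whole subtree and leaves the
                    // reader on the node that follows its end tag.
                    reader.Skip();
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "order")
                    {
                        count++;
                    }
                    reader.Read();
                }
            }
        }
        return count;
    }
}
```

Grouping rarely needed data under its own element, as the slide suggests, is what makes the Skip() call possible: attribute-only layouts give the reader nothing to jump over.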
40
Data
[Diagram: data access options]
Layers: Business logic and presentation; In-memory data copy; Serialization; File system; Transports; Remote system
Components shown: Business logic, GUI controls, Custom data structures (arrays, collections), DataSet, XmlDocument, Custom binary, XmlSerializer, IXmlSerializable, Data Adapters, Data Readers, Binary or text file, XML file, SQL Server Mobile, ActiveSync, HTTP, Sockets, MSMQ, Replication or RDA, Web services, Other data sources, Other DB, SQL DB
41
Data
[Diagram: local data access options]
Layers: Business logic and presentation; In-memory data copy; Serialization; File system
Components shown: Business logic, GUI controls, Custom data structures (arrays, collections), DataSet, XmlDocument, SqlCEResultSet, DataReader, Custom binary, XmlSerializer, IXmlSerializable, Binary or text file, XML file, SQL Server Mobile
42
Web Services
Where is the bottleneck?
Are you network bound or CPU bound?
Use perf counters: socket bytes sent / received. Do you come close to the network capacity?
If you are network bound, work on reducing the size of the message
Create a “canned” message, send it over HTTP, and compare performance with the web service
If you are CPU bound, optimize the serialization scheme for speed
http://blogs.msdn.com/mikezintel/archive/2005/03/30/403941.aspx
44
Moving Forward
More tools
Live Remote Performance Counters
Under construction
Allocation profiler (CLR profiler)
Call profiler
Working set improvements
More speed
45
Summary
Make performance a requirement and measure
Understand the APIs
Avoid unnecessary object allocation and copies due to
String manipulations
Boxing
Collections that are not pre-sized
Understand data access performance bottlenecks
46
Community Resources
At PDC
ILL03 Intelligent Data Synchronization in a Semi-Connected Environment
ILL04 Write Once, Display Anywhere: UI for Windows Mobile Devices
TLN316 Windows Mobile: New Emulation Technology for Building Mobile Applications with Visual Studio 2005
After PDC
MSDN dev center: http://msdn.microsoft.com/mobility/
.NET Compact Framework Team Blog: http://blogs.msdn.com/netcfteam/
.NET Compact Framework Performance FAQ: http://blogs.msdn.com/netcfteam/archive/2005/05/04/414820.aspx
47
© 2005 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
49