Wednesday 1 February 2012

C# Optimization Secrets

You want to see ways you can add performance optimizations to your C# programs, focusing on the level of the code statements and methods. While high-level considerations, and factors external to your code are often most important, such as computer processor and network speed, there are many low-level performance optimizations you can do inside the C# language that can improve performance. We describes these optimizations.
Tip: Focus on the "hot paths" in your program for optimizations.

Overview

In this overview, we describe the general considerations when optimizing your C# code. First, the C# language is compiled, and with the .NET Framework, you can attain performance close to languages such as C or C++.
Generally, using the simplest features of the language provides the best performance; for example, using the for-loop and avoiding parameters and return values is typically fastest. You must balance these performance goals with code readability and understandability.

Benchmark

At all levels of performance optimization, you should be taking measurements on the changes you make to methods. You can do this with the .NET Framework methods available in the Stopwatch type. It often pays to create a multitude of console programs where the methods are benchmarked repeatedly on data as it changes. You should always avoid regressing performance unless there is a clear reason to do so.

Static methods

In the C# language, non-inlined instance methods are always slower than non-inlined static methods. The reason for this is that to call an instance method, the instance reference must be resolved, to determine what method to call. Static methods do not use an instance reference.
If you look at the intermediate language, you will see that static methods can be invoked with fewer instructions. You can see an experiment based on the callvirt and call instructions on this site.

Avoid parameters

When you call any method in the C# language that was not inlined, the runtime will actually physically copy the variables you pass as arguments to the formal parameter slot memory in the called method. This causes stack memory operations and incurs a performance hit. It is faster to minimize arguments, and even use constants in the called methods instead of passing them arguments.

Avoid local variables

When you call a method in your C# program, the runtime allocates a separate memory region to store all the local variable slots. This memory is allocated on the stack even if you do not access the variables in the function call. Therefore, you can call methods faster if they have fewer variables in them.
One way you can do this is isolate rarely used parts of methods in separate methods. This makes the fast path in the called method more efficient, which can have a significant performance gain.

Constants

In the .NET Framework, constants are not assigned a memory region, but are instead considered values. Therefore, you can never assign a constant, but loading the constant into memory is more efficient because it can injected directly into the instruction stream. This eliminates any memory accesses outside of the memory, improving locality of reference. The performance advantage of const fields is demonstrated on this site.

Static fields

Static fields are faster than instance fields, for the same reason that static methods are faster than instance methods. When you load a static field into memory, you do not need the runtime to resolve the instance expression. Loading an instance field must have the object instance first resolved. Even in an object instance, loading a static field is faster because no instance expression instruction is ever used. Please review the article on this topic.

Inline methods

Unlike the C++ language, the C# language does not allow you to suggest a method be inlined into its enclosing method call spots. Often, the .NET Framework is conservative here and will not inline medium-sized or large methods. However, you can manually paste a method body into its call spot.
Typically, this improves performance in micro-benchmarks, and it is really easy to do. However, it will make code harder to modify; it is only suggested for a very few, critical spots in programs.

Switch

You will find that the switch statement compiles in a different way than if-statements typically do. For example, if you use a switch on an int, you will often get jump statements, which are similar to a computed goto mechanism. Using jump tables makes switches much faster than some if-statements; please see the pertinent article for more details. Also, using a char switch on a string is very fast.

Flattened arrays

Using two-dimensional arrays in C# is relatively slow. However, you can explicitly create a one-dimensional array and access it through arithmetic that supposes it is a two-dimensional array. This is sometimes called flattening an array. You must use multiplication and addition to acquire the correct element address. Typically, this optimization will improve the performance of accessing any array, and it is used extensively on this site.

Jagged arrays

While flattened arrays are typically most efficient, they are sometimes very impractical. In these cases, you can use jagged arrays to improve the lookup performance. The .NET Framework enables faster accesses to jagged arrays than to 2D arrays. Please note that jagged arrays may cause slower garbage collections, because each jagged array element will be treated separately by the garbage collector.

StringBuilder

If you are doing significant appending of strings using the C# language, the StringBuilder type can improve performance. This is because the string type is immutable and can not be changed without reallocating the entire object. Sometimes, using strings instead of StringBuilder for concatenations is faster; this is typically the case when using very small strings or doing infrequent appends.

Char arrays

Using char arrays in your C# code is sometimes the fastest way to build up a string. Typically, you will combine char arrays with for-loops and character testing expressions. This logic is more painful to develop and test, but the time savings can be very significant, making certain routines more than ten times faster, while reducing memory allocations as well.

Byte arrays

In the C# language, the smallest unit of addressable storage is the byte type. You can store ASCII characters in a single byte, as well as small numbers. If you can store your data in an array of bytes, this allows you to save memory. For example, an array of characters or a string uses two bytes per character; an array of bytes can represent that data in one byte per character, result in about half the total memory usage.

Arrays

In the .NET Framework, you have many options for collections, such as the List type, and various other types such as ArrayList. While these types are convenient and should be used when necessary, it is always more efficient to use a simple array if this is possible.
The reason for this is that the more complex collections such as List are actually composed of internal arrays. They add logic to avoid the burden of managing the array size on each use. However, if you do not need this logic, or can adjust your code so that the logic is not needed, using an array will be faster.

Capacities

For collections in the .NET Framework and C# language, you can use an optional capacity argument to influence the intial buffer sizes. It is best to pass a reasonable parameter in most cases when creating a collection such as a Dictionary or List. This avoids many allocations when adding elements that were not anticipated. Please see the pertinent article on Dot Net Perls for details.

Rewrite loops

Here, we describe ways that you can rewrite the loops in your C# programs to improve performance. While the foreach loop can have good performance in many cases, it is best to use the for-loop in all performance-critical sections when possible. The reason for this is that not only do for-loops sometimes have better raw performance, you can often reuse the index variable (induction variable) to optimize other parts of the loop or method.
Typically, the while loop, the for loop and the do-while loop have the best performance. Also, it is sometimes beneficial—and sometimes harmful—to "hoist" the maximum loop variable outside of the for-loop statement.

 Consider structs
Unless you know more about the C# language than I do, it is typically best to avoid structs entirely. If you use structs, you must be careful to not pass the struct as a parameter to methods often, or performance will degrade to worse than using a class type. The reason for this is that structs are copied in their entirety on each function call or return value.
Structs can improve the performance of the garbage collector by reducing the number of distinct objects. Also, you can sometimes use separate arrays instead of arrays of structs, which can improve performance further.

Lookup tables

While switch statements or hashtables such as Dictionary in the C# language can provide good performance, using a lookup table is frequently the optimal choice. For example, instead of testing each character using logic when lowercasing a string, you can translate each character through a lookup table. The lookup table can be implemented as a character array. Another example is that you can implement the ROT13 algorithm with a lookup table, improving performance by more than two times.

Char argument

Often, you may need to pass a single character to a certain method in your programs. For example, the StringBuilder type allows you to append a single char; the Response.Write method also allows you to write a single char. It is more efficient to pass a char instead of a single-char string. The char is a value type, and is represented by two bytes, while a string is a reference type and requires over 20 bytes. This site contains an exploration of StringBuilder char argument performance.

Avoid ToString

In this tip, we assert that it is poor programming style to use the ToString method unnecessarily. Sometimes, developers will call ToString on a character in a string, and then test it against a single-character string literal. This is grossly inefficient; instead, use a character testing expression with two chars.
Please reference the specific article on this topic for more details here. The article shows this mistake results in code that is ten times slower than the correct approach.

Int string cache

Many C# programs use the ToString method on integer values frequently. Unfortunately, this requires an allocation on the managed heap for the new string. This will cause the next garbage collection to become slower. You can actually use a lookup table to optimize common cases for the integer ToString operation. This site demonstrates how this lookup table can make the ToString method thirty times faster.

IL Disassembler

For .NET development, you should be opening your methods with the IL Disassembler tool provided by Microsoft. This is a free tool and it provides an interface for you to view the MSIL (Microsoft Intermediate Language) output of all your compiled Release executables. It is sometimes useful to save copies of the intermediate language as you make changes, or to even count instructions.

Avoid sorting

Often, you can avoid performing a sort operation on an array or string simply by testing whether the input string or array is already sorted. Sometimes, this makes a big performance improvement. In other cases, this slows down your programs. Please see the article about checking alphabetical characters for more information.

Avoid string conversions

In this optimization tip, we note that you can actually avoid many string-based conversions. For example, you may need to ensure that a string is lowercased. If the string is already lowercase, you can avoid allocating a new string entirely. However, the framework ToLower method will not avoid this for you; you must manually test to see if no lowercasing is necessary, as with a for-loop over the characters.

Avoid Path methods

Unfortunately, the Path methods in the System.IO namespace are somewhat slow for many applications. Sometimes they can cause unnecessary allocations to occur, copying strings more than once. You can sometimes use character-based algorithms to minimize allocations, improving performance by nearly three times.

Dictionary

It is important that you use hashtables in your programs when appropriate. The Dictionary collection in the .NET Framework is not optimal in many cases, but provides good performance in many different situations. While we assume a fundamental knowledge of algorithms and searching here, the Dictionary is an essential tip in any performance article.

Read this site

The site you are reading, Dot Net Perls, contains a multitude of optimization experiments, often proven with benchmarks that provide times in nanoseconds per method call. Resources such as this site can be invaluable for certain tasks in programming; before Dot Net Perls came about, no such site had this information on optimization.

Compiler theory

While experimentation such as benchmarking and analyzing instructions generated can result in excellent program performance, without understanding the core theories of compilers you may be lacking knowledge about program performance. Unfortunately, compiler theory involves a great deal of advanced mathematics and can be very dense to start with.
My observation is that only a tiny minority of application developers have a significant knowledge of compiler theory; this topic may be more suitable to academic computer scientists and not rapid application development programmers. A good book on this subject is the dragon book.

Temporal locality

Another way you can optimize a program significantly is by rearranging it to increase temporal locality. This means that methods that act on a certain part of memory (such as the hard disk) are run at all once. You can find out more about this optimization here.

Misnomer

The term optimization is actually a misnomer in computer science. A program can never be truly optimized. Because compiler theory is undecidable, a program can never be proven to be optimally efficient—perhaps another approach is faster?

 Resources

There are many pages on this website that are focused on optimization tips. These pages are listed below; most of them show how you can rewrite a certain pattern of code to something arguably more efficient. Please be aware some of these optimizations result in code that is less maintainable.



















No comments:

Post a Comment