Tuesday, December 7, 2010

More Boxing

What should make you suspicious of implicit boxing? Well, the Equals method should always ring alarm bells. It's inherited from the base Object class that everything derives from, and if the value type you are using doesn't provide an overload that takes a parameter of its own type, then an object will be needed and the argument gets boxed.
Similarly with CompareTo, not inherited from Object this time, but implemented from IComparable, and it takes as a parameter? Anyone? Yep, Object.
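
As a rough sketch of what "providing a replacement" can look like, here is a made-up value type (Metres is purely hypothetical, it isn't in the framework) that offers typed Equals and CompareTo overloads through IEquatable<T> and IComparable<T>. I'm assuming a compiler with generics, so .NET 2.0 or later. When the argument is a Metres, the compiler picks the typed overload and nothing needs boxing; cast it to Object and you are back to the boxing version.

```csharp
using System;

// Hypothetical value type, for illustration only.
public struct Metres : IEquatable<Metres>, IComparable<Metres> {
  public readonly Int32 Value;

  public Metres(Int32 value) { Value = value; }

  // Typed overloads: the argument is passed by value, so no box is needed.
  public bool Equals(Metres other) { return Value == other.Value; }
  public Int32 CompareTo(Metres other) { return Value.CompareTo(other.Value); }

  // Object-based version: a Metres argument has to be boxed by the caller.
  public override bool Equals(Object obj) { return obj is Metres && Equals((Metres)obj); }
  public override Int32 GetHashCode() { return Value; }
}

public static class TypedOverloads {
  public static void Main() {
    Metres a = new Metres(3);
    Metres b = new Metres(5);

    Console.WriteLine(a.Equals(b));          // binds to Equals(Metres), prints False
    Console.WriteLine(a.Equals((Object)b));  // binds to Equals(Object), b is boxed, prints False
    Console.WriteLine(a.CompareTo(b) < 0);   // binds to CompareTo(Metres), prints True
  }
}
```

Checking the IL with ILDasm, as in the examples in this post, is the way to confirm which overload a given call actually binds to.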

Strings lure unwary coders into boxing as well. The String.Concat method offers overloads that take Object parameters, and the + operator can bind to them without you noticing, for example:

using System;
using System.Windows.Forms;

public class Program{
  public static void Main() {

    Int32 number = 5;
    MessageBox.Show("This message Box was brought to you with the help of the number " + number);
  }
}

Using ILDasm to view the IL code:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       26 (0x1a)
  .maxstack  2
  .locals init (int32 V_0)
  IL_0000:  nop
  IL_0001:  ldc.i4.5
  IL_0002:  stloc.0
  IL_0003:  ldstr      "This message Box was brought to you with the help "
  + "of the number "
  IL_0008:  ldloc.0
  IL_0009:  box        [mscorlib]System.Int32
  IL_000e:  call       string [mscorlib]System.String::Concat(object,
                                                              object)
  IL_0013:  call       valuetype [System.Windows.Forms]System.Windows.Forms.DialogResult [System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
  IL_0018:  pop
  IL_0019:  ret
} // end of method Program::Main

IL_0009 shows the Int32 value type being boxed so that it can be passed to the implicitly called String.Concat(object, object) at IL_000e.

Boxing isn't difficult to avoid in most cases. For the above example a quick glance at MSDN will tell you that String.Concat offers a (String, String) overload. The first parameter I used was a string already, so the boxing problem was caused by the Int32, and if I use the ToString method available on Int32 there will be no need to box. Use: MessageBox.Show("This message Box was brought to you with the help of the number " + number.ToString()); instead and the IL becomes:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       27 (0x1b)
  .maxstack  2
  .locals init (int32 V_0)
  IL_0000:  nop
  IL_0001:  ldc.i4.5
  IL_0002:  stloc.0
  IL_0003:  ldstr      "This message Box was brought to you with the help "
  + "of the number "
  IL_0008:  ldloca.s   V_0
  IL_000a:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000f:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_0014:  call       valuetype [System.Windows.Forms]System.Windows.Forms.DialogResult [System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
  IL_0019:  pop
  IL_001a:  ret
} // end of method Program::Main
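
The same trick applies when more than one value type ends up in the message. A minimal sketch, using a console program and a made-up helper called Describe (both my assumptions, not part of the example above): every operand handed to + is already a String, so the compiler has no reason to pick one of the Object overloads of String.Concat.

```csharp
using System;

public static class ConcatDemo {
  // Hypothetical helper: both Int32 values are converted with ToString()
  // before the concatenation, so only String operands reach String.Concat.
  public static String Describe(Int32 index, Int32 total) {
    return "Item " + index.ToString() + " of " + total.ToString();
  }

  public static void Main() {
    Console.WriteLine(Describe(5, 12));  // prints "Item 5 of 12"
  }
}
```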

I think for the next post I'll mumble a bit about more ways to avoid boxing.

Monday, November 29, 2010

Boxing - Bad News

Now then, where did I get to? It's so hard to remember what you said in an old post. Ah yes, boxing lets you pass value types as object pointers, but sadly there is some bad news.

The performance goodness of using a value type instance goes away when you box it, so if you try that code from the Types post, and then edit it to do some boxing like I suggested in the Good News post you may see some output like:

2147483647 value types took       00:01:00.1181261
2147483647 reference types took 00:01:00.0122760

Ouch! Why did that happen? Remember how a value type was good for performance because it didn't carry the overheads associated with instantiating a type on the managed heap? When you box a value type, that is exactly what you are doing. In fact it's slightly less performant, because you had to create the value type as well.
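
For anyone who wants to try it, this is roughly what the edited value type loop might look like. It's a sketch rather than the exact code I timed: the TimeBoxing helper, the smaller count and the static field are all my own assumptions here. The field is only there so the boxed object is visibly used and is less likely to be optimized away.

```csharp
using System;
using System.Diagnostics;

struct ValueType { public Int32 x; }

public static class BoxedLoop {
  // Assumption: storing the box in a field keeps it observably alive.
  static Object last;

  // Hypothetical helper that times 'count' iterations of create-and-box.
  public static TimeSpan TimeBoxing(Int32 count) {
    Stopwatch sw = Stopwatch.StartNew();
    for (Int32 i = 0; i < count; i++) {
      ValueType vt = new ValueType();
      vt.x = 7;
      last = vt;  // implicit box: a new object is allocated on the managed heap
    }
    return sw.Elapsed;
  }

  public static void Main() {
    Int32 count = 10000000;  // much smaller than Int32.MaxValue so it finishes quickly
    Console.WriteLine("{0} boxed value types took {1}", count.ToString(), TimeBoxing(count).ToString());
  }
}
```

The timings will vary from machine to machine, so compare the result against the unboxed loop on the same box (sorry) rather than against my numbers.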

So boxing can be evil. Is it then just a case of making sure you don't box a value type if you don't have to, and really, really trying to avoid boxing inside a loop?

Not quite. As I mentioned previously, boxing is implicit, so you may not even notice that you did it. For example, DateTime is a value type (public struct DateTime) and it has a property DayOfWeek that returns an enum, also a value type (public enum DayOfWeek). System.Enum provides an override of the System.Object Equals method that will return true if it's passed an enum of the same type and underlying value. It could be used like this:

using System;

public sealed class Program {
  public static void Main() {
    DateTime today = DateTime.Now;

    if( today.DayOfWeek.Equals(DayOfWeek.Monday )) {
      Console.WriteLine("It's Monday");
    }
    else {
      Console.WriteLine("It's not Monday");
    }
  }
}

Unsurprisingly, as I was mumbling about implicit boxing, there is some in the code sample above. The culprit is the Equals method call, public override bool Equals(Object obj), which wants a parameter of reference type Object. I gave it an instance of a value type instead, so the compiler emitted a box instruction for me. How do I know? I checked the Intermediate Language (IL) produced by the compiler using ILDasm:

.method public hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size 66 (0x42)
  .maxstack 2
  .locals init (valuetype [mscorlib]System.DateTime V_0,
           bool V_1)
  IL_0000: nop
  IL_0001: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now()
  IL_0006: stloc.0
  IL_0007: ldloca.s V_0
  IL_0009: call instance valuetype [mscorlib]System.DayOfWeek [mscorlib]System.DateTime::get_DayOfWeek()
  IL_000e: box [mscorlib]System.DayOfWeek
  IL_0013: ldc.i4.1
  IL_0014: box [mscorlib]System.DayOfWeek
  IL_0019: callvirt instance bool [mscorlib]System.Object::Equals(object)
  IL_001e: ldc.i4.0
  IL_001f: ceq
  IL_0021: stloc.1
  IL_0022: ldloc.1
  IL_0023: brtrue.s IL_0034
  IL_0025: nop
  IL_0026: ldstr "It's Monday"
  IL_002b: call void [mscorlib]System.Console::WriteLine(string)
  IL_0030: nop
  IL_0031: nop
  IL_0032: br.s IL_0041
  IL_0034: nop
  IL_0035: ldstr "It's not Monday"
  IL_003a: call void [mscorlib]System.Console::WriteLine(string)
  IL_003f: nop
  IL_0040: nop
  IL_0041: ret
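} // end of method Program::Main

The IL shows two box operations. The one at IL_000e boxes the DayOfWeek value returned by the property so that it can be passed as the this pointer to "IL_0019: callvirt instance bool [mscorlib]System.Object::Equals(object)", because the override of Equals lives in the base type System.Enum rather than in DayOfWeek itself. The one at IL_0014 puts DayOfWeek.Monday on the managed heap so that it can be passed as the Object parameter.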
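
For this particular check there is a simple way round it, sketched below with a made-up helper called IsMonday (my own name, not from the sample above). The == operator on two values of the same enum type compares them as integers, so neither side needs to be turned into an Object.

```csharp
using System;

public static class MondayCheck {
  // Hypothetical helper: == on two DayOfWeek values compares the
  // underlying integers directly, so nothing is boxed.
  public static Boolean IsMonday(DateTime when) {
    return when.DayOfWeek == DayOfWeek.Monday;
  }

  public static void Main() {
    Console.WriteLine(IsMonday(DateTime.Now) ? "It's Monday" : "It's not Monday");
  }
}
```

As before, ILDasm is the place to check: the two box instructions should be gone from the IL for the comparison.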

Boxing - Good News

In my previous post I mumbled about how value types were good for performance (of code – stop sniggering) but that isn’t the end of their story.

All value types derive from System.ValueType, which derives from System.Object, so any time you have to use a method that only takes a parameter of type object rather than the value type you want to pass it, just cast your value type to object. No problem.

Oh, hang on, there is a slight problem, an object variable is a pointer to the instance of a type on the managed heap and we've only got a value type on the thread stack - um.............. Fortunately the clever coders in Microsoft have solved the problem, anytime you need to use your value type instance as an object the Common Language Runtime (CLR) will wrap it into a reference type instance on the managed heap and give you a pointer to it. The process for doing this is what I like to call Boxing (If you don’t watch Miranda you didn’t get the joke).

The good news is boxing is implicit, all I need to do is add a line of code to my ValueType loop from the previous post that says:

object o = vt;

and the value type has been boxed and I have a pointer (o) to a reference type instance on the managed heap. Boxing is great! The next post will be Boxing - Bad News where it is possible the level of greatness will reduce.
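
One thing worth seeing for yourself is that the box holds its own copy of the fields. A small sketch, with a hypothetical Counter struct and helper name of my own choosing: after boxing, changing the original on the stack doesn't touch the copy on the heap, and casting back (unboxing) copies the fields out again.

```csharp
using System;

// Hypothetical value type, for illustration only.
struct Counter { public Int32 x; }

public static class BoxCopy {
  public static Int32 BoxedValueAfterChange() {
    Counter vt = new Counter();
    vt.x = 7;

    Object o = vt;                 // box: the fields are copied to the managed heap
    vt.x = 8;                      // changes only the copy on the stack

    Counter unboxed = (Counter)o;  // unbox: the fields are copied back out of the box
    return unboxed.x;              // still 7
  }

  public static void Main() {
    Console.WriteLine(BoxedValueAfterChange());  // prints 7
  }
}
```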

Reference and Value Types

.NET has two kinds of types, value types and reference types. A value type instance is held on the thread stack, a reference type instance is held on the managed heap. A variable for a value type contains the fields for that instance of the type on the stack; a variable for a reference type contains a pointer to the memory location of the instance of the type on the heap. The Framework Class Library (FCL) has a large number of reference types and a small number of value types, but most code will use more value type instances than reference type instances. A value type uses less memory (no type object pointer, no sync block index), is faster to create (when you use a lot of them) and is neither subject to garbage collection nor a cause of it.

And it's that last sentence that explains why value types are important: used properly they can improve performance. Take a look at the following code example:

using System;
using System.Diagnostics;

class RefType { public Int32 x; }
struct ValueType { public Int32 x; }

public sealed class Program {
  public static void Main() {
    Int32 count = Int32.MaxValue;

    Stopwatch sw = Stopwatch.StartNew();
    for (Int32 i = 0; i < count; i++) {
      ValueType vt = new ValueType();
      vt.x = 7;
    }
    Console.WriteLine("{0} value types took     {1}", count.ToString(), sw.Elapsed.ToString());
   
    sw = Stopwatch.StartNew();
    for (Int32 i = 0; i < count; i++) {
      RefType rt = new RefType();
      rt.x = 7;
    }
    Console.WriteLine("{0} reference types took {1}", count.ToString(), sw.Elapsed.ToString());
  }
}

which produces:

2147483647 value types took     00:00:08.4344126
2147483647 reference types took 00:00:54.9586018

Instantiating lots of simple value types is much faster than instantiating lots of simple reference types - great!
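
The difference between a variable that contains the fields and a variable that contains a pointer also shows up when you assign one variable to another. Here is a minimal sketch reusing the two types from the timing code; the helper method names are my own. Copying a reference type variable copies the pointer, so both variables see the same instance, while copying a value type variable copies the fields.

```csharp
using System;

class RefType { public Int32 x; }
struct ValueType { public Int32 x; }

public static class CopySemantics {
  public static Int32 RefAfterCopy() {
    RefType a = new RefType();
    a.x = 7;
    RefType b = a;    // copies the pointer: a and b refer to the same instance
    b.x = 8;
    return a.x;       // 8
  }

  public static Int32 ValueAfterCopy() {
    ValueType a = new ValueType();
    a.x = 7;
    ValueType b = a;  // copies the fields: b is a separate instance
    b.x = 8;
    return a.x;       // 7
  }

  public static void Main() {
    Console.WriteLine("reference type: {0}", RefAfterCopy().ToString());  // prints "reference type: 8"
    Console.WriteLine("value type:     {0}", ValueAfterCopy().ToString()); // prints "value type:     7"
  }
}
```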