.NET performance and your hardware
April 14th, 2011 by evereq

Inspired by Ayende's post (see http://ayende.com/Blog/archive/2011/04/14/performance-numbers-in-the-pub.aspx), I wanted to check how many objects my work PC can create :D
Moreover, now seems like the best time to do it: my company ordered new SUPER machines for our team today, so it will be interesting to compare them with my current PC using this “.NET” test.

My current PC has an Intel Q8200 (4 cores), 8 GB of memory, an 80 GB Intel SSD, etc. You get the idea :)

The main difference between my code and Ayende's code (as well as the code in many of the blog comments) is that I create objects using multiple threads. What I find interesting is that most people, even today, simply forget that their PC (or server) has multiple cores, and that modern development should be done with one idea stuck in your mind: “multitasking / multithreading / async may be used if you want to make optimal use of your hardware”. Sure, it doesn't always make sense to go that “hard way”; some tasks just don't scale much with that approach. At the same time, in this specific test I was able to get about a 50% performance gain over the single-threaded version using a very simple threading model. Not bad, right?

Sure, it's not really the optimal strategy (plain C / C++ would surely be more optimal), and I think memory speed limits the CPU in this specific test (i.e. it doesn't matter how many cores the CPU has if your memory / memory controller just can't process all of those requests in parallel). In addition, I also initialize an integer field of each created object with some value to prevent some possible optimizations in the CLR. You can also find several small optimizations to avoid some overhead, etc. Sure, it's not the best code out there, but it's effective enough to demonstrate how the TPL can improve results even in such a simple task. Maybe later I'll find the time to write the same thing in C++ with the Parallel Patterns Library (PPL) to measure the difference :D

On my PC the test starts 4 tasks (one per detected core), and each of them creates objects in a loop. Note that the test runs for 10 seconds to give more precise results :)

Test results: around 150M objects created and initialized per second (don't forget to build in Release mode if you really want to run it).
I promise to publish new results when the new PC arrives at our office ;-)

Btw, if you get your own results, I'll be glad to see and compare them :D

Stay tuned :)

P.S. The code is not fully optimized and is not “production ready”. I just wanted to play a bit with what Ayende originally wrote…
For example, you may notice that the test takes longer than 10 seconds to run if your PC is already busy with other tasks (simply because I use too many threads for such an environment), etc.

P.P.S. No warranties :) Please do not run that code on your production servers :D

Update #1: My colleague has a notebook with an Intel i5 CPU (4 cores) and he just got a result of 215M :D

Update #2: I just got the new PC with an Intel i7-2600K CPU (at 3.5 GHz), 8 GB of DDR3 1600 MHz memory, etc. The result is just amazing: 700M with the same source code :D WOW!

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.Timers;

namespace Testing
{
    class Program
    {
        static volatile bool _isContinue = true;

        static long _total;

        // Timer interval in milliseconds: 10000 ms = 10 seconds (despite the "Sec" in the name)
        const int TestTimeSec = 10000;

        static Stopwatch _sp;

        static void Main(string[] args)
        {

            var tasks = new List<Task>();

            int processors = Environment.ProcessorCount;

            Console.WriteLine("Detected {0} cores", processors);

            var timer = new System.Timers.Timer(TestTimeSec);
            timer.Elapsed += TimerElapsed;
            timer.AutoReset = false;
            timer.Start();

            _sp = Stopwatch.StartNew();

            for (int t = 0; t < processors; t++)
            {
                tasks.Add(Task.Factory.StartNew(
                    () =>
                    {
                        long i = 0;

                        while (_isContinue)
                        {
                            var obj = new MyClass();
                            i++;
                            obj.B = i;
                        }

                        // publish this task's local count once, instead of on every iteration
                        Interlocked.Add(ref _total, i);
                    }
                ));
            }

            // let's complete all the tasks to get results into _total
            Task.WaitAll(tasks.ToArray());

            _sp.Stop();

            // the test runs for 10 seconds, so divide by 10 to report per-second figures
            Console.WriteLine("Created {0} objects in {1}", _total / 10, _sp.Elapsed.TotalMilliseconds / 10);
            Console.ReadKey();

        }

        static void TimerElapsed(object sender, ElapsedEventArgs e)
        {
            _isContinue = false;
            Console.WriteLine("Stopping...");
        }

        public class MyClass
        {
            public string A;
            public long B;
            public DateTime C;
        }

    }
}
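
To reproduce the single-threaded versus multi-threaded comparison mentioned above, the same loop can be wrapped in a method that takes the number of workers as a parameter. This is only a minimal sketch, not the code used for the numbers in this post: the names `AllocationBenchmark`, `Payload` and `Measure` are made up for illustration, and it uses a `Stopwatch` deadline instead of the timer and volatile flag.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class AllocationBenchmark
{
    // Hypothetical stand-in for MyClass: one field that is written after creation.
    sealed class Payload { public long Value; }

    // Hypothetical helper: runs `workers` tasks for roughly `milliseconds` ms
    // and returns the number of objects created per second across all of them.
    public static double Measure(int workers, int milliseconds)
    {
        long total = 0;
        var tasks = new Task[workers];
        var clock = Stopwatch.StartNew();

        // Stopwatch.GetTimestamp() is static, so every task can read it safely.
        long deadline = Stopwatch.GetTimestamp() + milliseconds * Stopwatch.Frequency / 1000;

        for (int t = 0; t < workers; t++)
        {
            tasks[t] = Task.Factory.StartNew(() =>
            {
                long created = 0;
                while (Stopwatch.GetTimestamp() < deadline)
                {
                    // Check the clock only once per 1024 allocations to keep overhead low.
                    for (int k = 0; k < 1024; k++)
                    {
                        var obj = new Payload();
                        obj.Value = ++created;
                    }
                }
                // Publish the local count once per task.
                Interlocked.Add(ref total, created);
            }, TaskCreationOptions.LongRunning);
        }

        Task.WaitAll(tasks);
        clock.Stop();
        return total / clock.Elapsed.TotalSeconds;
    }
}
```

Calling `AllocationBenchmark.Measure(1, 2000)` and then `AllocationBenchmark.Measure(Environment.ProcessorCount, 2000)` in a Release build gives two figures to compare; how large the gap is will depend on the machine.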

  • http://www.zachburlingame.com Zach Burlingame

    Good stuff! My initial results were ~476M objects:

    Detected 8 cores
    Stopping…
    Created 475965526 objects in 1001.5932

    Unfortunately, you are still suffering from one of the issues I pointed out with Ayende’s version.

    First, some background on how I ran these tests so that others may compare more easily.
    - I built the project in the Release x86 configuration
    - I’m running Windows 7 64-bit SP1 and .NET v4.0.30319
    - I used a console application and targeted the .NET Framework 4 Client Profile
    - I ran my tests detached from the debugger (Ctrl-F5) which increased the number of objects created per second by 170%
    - My machine has 12GB of DDR3 and an Intel Xeon E5620 @ 2.4GHz
    - I have 4 physical cores and HT is enabled, so my processor count was 8

    Observations:

    - Your “MyClass” contains an integral value type (long) and two reference types (string and DateTime). DateTime is itself an object reference type which may contain other value and reference types. This means you are actually creating more total CLR objects than you think (since MyClass is a compound object and that is all we are counting).

    - Changing the definition of MyClass changes the numbers substantially (number of objects created as a percentage of the original):
      - string/long/DateTime (Created 476582824 objects in 1000) – 100%
      - string/long (Created 628159585 objects in 999) – 132%
      - long (Created 708269097 objects in 1000) – 148%

  • http://www.zachburlingame.com Zach Burlingame

    Apparently the stopwatch, assignments, and while loop conditional are a decent amount of overhead as well. Before, using just a long in the MyClass, I was getting ~708M obj/s (see my previous comment). Now I’m getting almost 1B obj/s:

    Detected 8 cores
    Created 924214000 in 1 sec

    Here is the updated code.

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;

    namespace Testing
    {
        class Program
        {
            private static long _totalObjectsCreated = 0;
            private static long _totalElapsedTime = 0;

            static void Main(string[] args)
            {
                var tasks = new List<Task>();
                int processorCount = Environment.ProcessorCount;

                Console.WriteLine("Detected {0} cores", processorCount);

                for (int t = 0; t < processorCount; ++t)
                {
                    tasks.Add(Task.Factory.StartNew(
                        () =>
                        {
                            int reps = 1000000000;
                            Stopwatch sp = Stopwatch.StartNew();
                            for (int j = 0; j < reps; ++j)
                            {
                                new object();
                            }
                            sp.Stop();

                            Interlocked.Add(ref _totalObjectsCreated, reps);
                            Interlocked.Add(ref _totalElapsedTime, sp.ElapsedMilliseconds);
                        }
                    ));
                }

                // let's complete all the tasks to get results into _totalObjectsCreated
                Task.WaitAll(tasks.ToArray());

                Console.WriteLine("Created {0} in 1 sec", (_totalObjectsCreated / (_totalElapsedTime / processorCount)) * 1000);
                Console.ReadKey();
            }
        }
    }
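
The final `Console.WriteLine` in the code above does its arithmetic in integer division: it averages the elapsed milliseconds per task, divides the object count by that average, and only then multiplies by 1000, which is why the printed figure ends in three zeros. A minimal sketch of the same formula in floating point, assuming a hypothetical helper named `ObjectsPerSecond` (not part of the original code):

```csharp
using System;

static class Throughput
{
    // Hypothetical helper: objects per second, given the summed object count,
    // the summed elapsed milliseconds of all tasks, and the number of tasks.
    public static double ObjectsPerSecond(long totalObjects, long totalElapsedMs, int taskCount)
    {
        if (taskCount <= 0)
            throw new ArgumentOutOfRangeException("taskCount");

        // Average wall-clock time per task; the tasks run in parallel,
        // so this approximates the duration of the whole run.
        double averageElapsedMs = (double)totalElapsedMs / taskCount;
        if (averageElapsedMs <= 0)
            throw new ArgumentOutOfRangeException("totalElapsedMs");

        return totalObjects / averageElapsedMs * 1000.0;
    }
}
```

With made-up inputs, 8 tasks that each create 1,000,000,000 objects in 8,000 ms give `Throughput.ObjectsPerSecond(8000000000, 64000, 8)`, which evaluates to 1,000,000,000 objects per second.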

  • http://www.zachburlingame.com Zach Burlingame

    Update: I was incorrect about DateTime being a compound object. It’s actually a value type (a struct). It is, however, still slower to have MyClass use two value types (long/DateTime) and one reference type (string) than just long. In my attempts to purely test “How many CLR objects can you create in 1 second”, I replaced new MyClass() with new object() to avoid the overhead associated with the other types.
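
The value type versus reference type distinction in the update above can be checked directly through reflection. A small sketch, where `TypeKinds` and `Describe` are hypothetical names used only for this illustration:

```csharp
using System;

static class TypeKinds
{
    // Hypothetical helper: reports whether a type is a value type
    // (stored inline in its container) or a reference type
    // (allocated as a separate object on the heap).
    public static string Describe(Type type)
    {
        return type.IsValueType ? "value type" : "reference type";
    }
}
```

`TypeKinds.Describe(typeof(DateTime))` and `TypeKinds.Describe(typeof(long))` both return "value type", while `TypeKinds.Describe(typeof(string))` returns "reference type". So the `long` and `DateTime` fields only make each MyClass instance larger, and the `string` field is just a null reference until something is assigned to it.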
