Ode To Code

Thursday, March 5, 2009

C# vs. C++. In chase of performance

Hey, folks. I'm continuing my previous theme. Today I’m going to talk about the C# managed and C++ native environments. What performance and memory issues can appear? What does "managed" mean here? What’s the conclusion?

I have repeatedly seen software engines built on the .NET virtual machine with C++ parts inside. Of course, if an engine is intended for a .NET audience, exposing a rich .NET API is a good approach. However, the question that worried me was whether it makes sense to migrate to a virtual machine when a lot of performance-critical logic must be written. How much is this logic slowed down when it runs in a managed environment?
I should say right away that there is another performance-critical niche, where C/C++ may be combined with assembler instructions and where C# obviously doesn’t belong. I’m not familiar with this domain, so I don’t cover it here.

Test

I chose operations over a hierarchy tree. The test is synthetic. However, all of its parts (creation of sibling elements, up/down iteration and removal) can well occur in real logic. For both environments I chose standard solutions: C++ STL containers and streams, and C# generics and IO. Both programs read the same operation sequences from files and calculated the same checksums.

My first results confused me. Surprisingly, the C++ solution did badly. After some investigation I found that the C++ file stream was slower than the C# one, because its extraction safely handles any sequence of special symbols, whereas in C# I relied on a predefined sequence of special symbols. So I replaced the text files with binary files with a predefined layout. I also forced a C# garbage collection so that element removal was fully equivalent in both programs.

All measurements were taken side by side, alternating between the two programs, so the idle/busy state of the operating system should affect both applications equally.
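The original benchmark code isn't shown, so here is only a rough sketch of what the C++ side of such a test could look like, assuming C++14 or later. The Node layout, the Build/SumDown/RunOnce names and the tree shape are my assumptions for illustration, not the code that produced the numbers below. It covers creation, downward iteration with a checksum, and removal, all inside the timed region.

```cpp
#include <chrono>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical tree node; the node layout of the original test is not shown.
struct Node {
    std::uint32_t value;
    Node* parent;
    std::vector<std::unique_ptr<Node>> children;
    Node(std::uint32_t v, Node* p) : value(v), parent(p) {}
};

// Creation: add `fanout` siblings under each node, `depth` levels deep.
void Build(Node& node, int depth, int fanout, std::uint32_t& nextValue) {
    if (depth == 0) return;
    for (int i = 0; i < fanout; ++i) {
        node.children.push_back(std::make_unique<Node>(nextValue++, &node));
        Build(*node.children.back(), depth - 1, fanout, nextValue);
    }
}

// Downward iteration: the checksum is the sum of all values in the subtree.
std::uint64_t SumDown(const Node& node) {
    std::uint64_t sum = node.value;
    for (const auto& child : node.children) sum += SumDown(*child);
    return sum;
}

struct Result {
    std::uint64_t checksum;
    double seconds;
};

// One timed run: build the tree, compute the checksum, remove the tree.
Result RunOnce(int depth, int fanout) {
    const auto start = std::chrono::steady_clock::now();
    std::uint32_t next = 1;
    auto root = std::make_unique<Node>(0, nullptr);
    Build(*root, depth, fanout, next);
    const std::uint64_t checksum = SumDown(*root);
    root.reset();  // removal is part of the measured work
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {checksum, elapsed.count()};
}
```

Calling RunOnce several times and averaging the seconds field gives a figure comparable in spirit to the "average spent time" column; the checksum lets two implementations confirm they did the same work.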

Results

	Average spent time, sec	Resident memory, MB
C# (.NET2.0)	1,761	30,6
C++(VC8.0)	1,838	26,2

This result makes no sense: C++ loses to C# on time. The next attempt is to use Visual Studio 2003.

	Average spent time, sec	Resident memory, MB
C# (.NET2.0)	1,718	30,6
C++(VC7.1)	0,89	25,4

That's more like it. Searching the Internet, I found the following useful links:
http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/4431bab9-7c6e-4907-8e85-66ac8798dbc6/
http://www.jenkinssoftware.com/raknet/forum/index.php?topic=2091.0
As expected, applying additional compiler options doesn't help:

	Average spent time, sec
C# (.NET2.0)	1,757
C++(VC8.0 + compiler options)	1,785

So, try to guess: what's the real reason for this C++ slowdown?

What about Linux?

	Average spent time, sec	Resident memory, MB
C# (mono + .NET2.0)	2,93	51,8 (mono)
C++ (wine + VC7.1)	1,25	26,8 + 7,9 (wine) = 34,7
C++(g++)	1,45	13,4

More nonsense. Bravo, MS guys! Your solution beats the native one. You can do it when you want to.

Conclusion

When can performance be the deciding argument in choosing between C# and C++? In terms of general performance I think it is not an argument, even for engines. Only in solutions where logic routines (tight loops, real-time virtual processors ...) must deliver maximum performance is C++ the key, because it’s more flexible (for instance, libraries faster than the standard STL exist, and you can build your own framework) and because managed environment features (type checking ...) must be avoided by definition.
In any case, a single local kludge can cause more performance problems than the programming environment as a whole.

Saturday, February 21, 2009

C# vs. C++

Today I’m going to bring the two programming monsters together, compare their features and score the bout. It’s impossible to cover everything in one post, so here I’m going to talk about the things that make up each language's development philosophy: the distinctive attributes that set them apart from each other.

C++ strengths

1) friend

A friend declaration allows one class to access the non-public members of another class. At first glance this feature looks redundant and amounts to nothing but a violation of object-oriented principles. However, I can think of four cases where it’s important to have friends.

Non-public constructors/destructors. This case covers different creational patterns.

Non-public methods that are nevertheless not intended for the class's internal use. This technique restricts a non-public class interface to a specific class, group of classes or proxies.

Direct access to class data is needed for some operation that the class is not itself responsible for. The best example here is (de)serialization. Moreover, you can see that the C# XmlSerializer undermines data encapsulation completely.

Classes that are tightly coupled by design, as an alternative to nested classes.

The C# alternative to the C++ friend is the internal keyword, which is part of the friend assembly technique. However, I can’t say that it’s equivalent.
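To make the first case (non-public constructors and creational patterns) concrete, here is a small sketch. Document and DocumentFactory are made-up names for illustration only; the point is that the constructor stays private and only the factory, named as a friend, can call it.

```cpp
#include <memory>
#include <string>

class Document {
    // Only DocumentFactory may call the private constructor.
    friend class DocumentFactory;
    explicit Document(const std::string& title) : _title(title) {}
    std::string _title;
public:
    const std::string& Title() const { return _title; }
};

class DocumentFactory {
public:
    static std::unique_ptr<Document> Create(const std::string& title) {
        // std::make_unique cannot be used here: it is not a friend of
        // Document, so it has no access to the private constructor.
        return std::unique_ptr<Document>(new Document(title));
    }
};
```

Client code can write DocumentFactory::Create("report"), while a direct Document d("report") outside the factory is rejected by the compiler.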

2) const

In C++ const serves two essential purposes, both of which were omitted in C#.

Passing arguments to a method with the const keyword prevents their modification inside the method.

Marking class methods with the const keyword guarantees that class data are read-only inside those methods.

Of course, these techniques require extra work, but it’s a very powerful feature. Often a const keyword says more than the accompanying method documentation.
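Both uses of const described above fit into a few lines. This is just an illustration; Samples and SumTwice are hypothetical names.

```cpp
#include <numeric>
#include <vector>

class Samples {
    std::vector<int> _values;
public:
    void Add(int v) { _values.push_back(v); }
    // const method: promises not to modify the object's data.
    int Sum() const {
        return std::accumulate(_values.begin(), _values.end(), 0);
    }
};

// const reference parameter: the function can read `s` but not modify it.
int SumTwice(const Samples& s) {
    // s.Add(1);  // would not compile: Add is not a const method
    return 2 * s.Sum();
}
```

The signature of SumTwice alone tells the caller that the argument comes back unchanged, which is exactly the "says more than documentation" effect.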

3) Memory model

My complaint about C# is its excessive simplification of the memory model. It becomes a problem when you need to pass a structure (value type) by reference or, vice versa, to pass a class (reference type) by value. The first case was solved by adding the ref/out keywords (except for returning a value from a method by reference: that is impossible because the value's lifetime is strictly bound to the method’s scope. There are other situations where a structure needs to be returned by reference, and that is impossible too). Apart from that remark, things look normal here. Passing by value by default is an optimal approach, and when you really need a reference, the ref/out keywords are there.
The second case is more complicated. When you need to pass an object by value, you have to clone it; the ICloneable interface is intended for this. However, nothing in a method’s declaration indicates that an object is passed by value. It’s a potential pitfall. Another interesting thing is that you can use the ref/out keywords with reference types. It’s a rough equivalent of a C++ reference to a pointer. The following C# snippet demonstrates it:


 using System;

 public class A
 {
 }

 class Program
 {
  static void Method(A a)
  {
   a = new A();
  }
  static void Method(ref A a)
  {
   a = new A();
  }
  static void Main()
  {
   A a = null;
   Method(a);
   Console.WriteLine(a == null);//True
   Method(ref a);
   Console.WriteLine(a == null);//False
  }
 }

In C++ the concepts of pointer and reference, together with the arrow, dot, dereference and address-of operators, solve all these tasks in a general manner without any ad hoc solutions.
Another problem is the garbage collector. Yes, it frees you from managing memory, but it also frees you from thinking about memory usage at all. The new keyword in C++ is not just an operator that creates an object: it creates a heap object, and doing so expresses an explicit intention to control the object’s lifetime.
Of course, I can’t avoid mentioning the famous C++ memory leaks. Well, they are not as horrible as people say. Dangling pointers that never leave your own code are easy to fix. Problems can arise with third-party APIs and foreign code, where a solution becomes harder. However, I can’t drop this C++ point for that reason.
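For comparison with the C# ref snippet in this section, here is roughly how the same two cases look in C++: a pointer passed by value versus a reference to a pointer. The function names are mine, and a static object is used instead of new only so that the sketch doesn't leak.

```cpp
struct A {};

// A single shared object to point at, so the example needs no delete.
A& Instance() {
    static A instance;
    return instance;
}

// The pointer itself is copied: reassigning it is invisible to the caller.
void RebindCopy(A* a) {
    a = &Instance();
    (void)a;  // silence "set but not used" warnings
}

// Reference to pointer: the caller's pointer is reassigned,
// which is the rough counterpart of `ref A a` in C#.
void RebindRef(A*& a) {
    a = &Instance();
}
```

Starting from A* p = nullptr, p is still null after RebindCopy(p) and points to the shared instance after RebindRef(p), mirroring the True/False output of the C# version.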

4) header/source organization

Another nice C++ thing is headers. I don’t mean the compilation/linkage model (here C++ is much more difficult than C#); I mean the header as the table of contents of a module and header + source as a complete module. Abstraction and modularity are key design concepts, and C++ makes the cut here. The C# approach, which uses only source files, with one file holding both the class declaration and its definition, doesn’t appeal to me.

5) cross-platform feature

When a project or a part of it may become cross-platform, or uses Linux or Mac as its primary environment, or relies on third-party cross-platform libraries already written in C++, C++ becomes a good choice for getting native cross-platform code.

C# strengths.

1) interface

The interface is an amazing thing that C# gives us. I don’t mean the technical approach (by the way, I think C++ with multiple inheritance is better here, because a base class can include not only behavior but private data as well, which forms a complete reusable block); I mean the ideological approach: how the framework classes reuse interfaces and how this raises the level of abstraction overall. Learning an API becomes easier. Ideally, you learn a class description and the interfaces that the class implements. The number of interfaces in an API is much smaller than the number of classes. The difference between inheritance and interfaces is that inheritance helps considerably with code reuse but unfortunately doesn’t help to create a good abstraction. An interface, which is an evolved form of the abstract class, defines a contract, and each class that implements it must adhere to that contract. Moreover, interfaces cut across class hierarchies.
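Since the rest of this blog is C++, here is how the same contract idea is usually approximated there with a pure abstract class. IDescribable, File, User and Label are invented names for the sketch; the two implementing classes deliberately share no other base, which is the "cut across hierarchies" point.

```cpp
#include <string>

// The contract: anything describable can produce a text description.
class IDescribable {
public:
    virtual ~IDescribable() {}
    virtual std::string Describe() const = 0;
};

// Two otherwise unrelated classes implement the same contract.
class File : public IDescribable {
    std::string _path;
public:
    explicit File(const std::string& path) : _path(path) {}
    std::string Describe() const override { return "file " + _path; }
};

class User : public IDescribable {
    std::string _name;
public:
    explicit User(const std::string& name) : _name(name) {}
    std::string Describe() const override { return "user " + _name; }
};

// Client code depends on the contract only, not on concrete classes.
std::string Label(const IDescribable& item) {
    return "[" + item.Describe() + "]";
}
```

Label works for File("a.txt") and User("bob") alike, without knowing anything about either class beyond the interface.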

2) Environment benefits.

The .NET Framework coding style and naming conventions. To tell the truth, I’ve been using them everywhere.

Visual Studio may be the best development environment, even for cross-platform C++ projects. But some features, like code auto-formatting (Ctrl-K-D), the Refactor menu and so on, are available for C# only.

Faster, easier and more careful compilation.

3) ASP + ADO

This combination forms a complete platform for web portals or any web-server-based solution. It’s a stock phrase that stands for web/data programming on the MS Windows platform + MS SQL Server.

4) WPF

GUI programming is evolving very well. Since .NET 3.0 it targets any desktop/web environment in a generalized manner, including animation, 2D/3D graphics and multimedia features.

5) WCF

Service-oriented features are evolving too. Since .NET 3.0, Web Services and Remoting have been unified in one package.

Conclusion.

So, as of today, what are the reasons to choose between C# and C++? If we talk about IT-related projects, C# becomes the obvious choice. If we talk about programming projects, we should look at C++ points 5 and 3 and at C# points 3 through 5. Depending on the project’s complexity, either can be the choice. However, the most important matter here is performance and the cost of machine resources. I haven’t actually discussed the C# managed environment vs. the C++ native one. That’s another discussion, to be covered later.
Another programming niche is maintaining existing projects, and here C++ is in demand. Migrating the code doesn’t help.

Saturday, January 31, 2009

extern pitfalls in G++

Look at the example:
unit1.cpp


#include <iostream>
using namespace std;

int I = 2;
extern const int J = 10;
void FuncI();
void FuncJ();


int main()
{
 cout << hex;
 cout << "I, before " << I << " " << *(&I + 1) << endl;
 FuncI();
 cout << "I, after " << I << " " << *(&I + 1) << endl;
 FuncJ();
}

unit2.cpp


extern long long I;
extern int J;


void FuncI()
{
 I = I << 36;//causes contiguous memory damage
}


void FuncJ()
{
 J++;//causes segmentation fault
}

The program compiles fine and the results are:


I, before       2 0
I, after        0 20
Segmentation fault

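Why is the g++ toolchain so permissive here? I don't know. Both mismatched declarations are invalid C++, but the standard does not require a diagnostic for them. In Visual Studio it's impossible to do such nasty things.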
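One common way to avoid this pitfall on any compiler is to declare shared globals once in a header that every unit includes, so the compiler itself sees a mismatch. The sketch below squeezes the three files into one listing; globals.h, ShiftedI and NextJ are hypothetical names, and the comments mark where each part would live.

```cpp
// ---- globals.h (hypothetical shared header) ----
extern int I;        // the single place where the type of I is declared
extern const int J;  // const is part of the declared type

// ---- unit1.cpp (would #include "globals.h") ----
int I = 2;
const int J = 10;    // external linkage, because the header declared it extern

// ---- unit2.cpp (would #include "globals.h") ----
// Widen explicitly instead of redeclaring I with another type.
long long ShiftedI() { return static_cast<long long>(I) << 36; }

// Read J instead of modifying it.
int NextJ() { return J + 1; }

// With the header included, the original mistakes no longer compile:
// extern long long I;   // error: conflicting declaration
// void Bump() { J++; }  // error: increment of read-only variable
```

The design choice is simply that no source file ever writes its own extern declaration by hand; it always gets it from the header.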

Sunday, January 25, 2009

Singleton pattern C++. My recipe.

Look at the example:


#include <iostream>
#include "SingletonPattern.h"


class A : public ISingleton<A>
{
 //a few extra movements (friend declarations) is needed to create own non-public default ctor and dtor.
 A() { j = 10;};
 ~A() {std::cout << "~A" << std::endl;};
 friend class ISingleton<A>;
 friend class std::auto_ptr<A>;

public:
 //do not define any public ctors and dtor.
 //do not define copy ctor and assign operator at all.

 static int j;
 void Test(int i) { std::cout << "A::Test(" << i << ") " << ++j << std::endl;};
};
int A::j = 10;


class B : public ISingleton<B>
{
public:
 void Test(int i) { static int j = 0; std::cout << "B::Test(" << i << ") " << ++j << std::endl;};
};


class C : public ISingleton<C>
{
public:
 virtual void Test(int i) { static int j = 0; std::cout << "C::Test(" << i << ") " << ++j << std::endl;};
};


class D : public C
{
public:
 void Test(int i) { static int j = 0; std::cout << "D::Test(" << i << ") " << ++j << std::endl;};
};


int main()
{
 A::GetSingleton().Test(5);
 A& a = A::GetSingleton();
 a.Test(6);
// A anotherA = A::GetSingleton();//compilation error, ok
// a = a;//compilation error, ok

 B::GetSingleton().Test(3);
 B& b = B::GetSingleton();
 b.Test(100);
 C::GetSingleton().Test(33);
 D::GetSingleton().Test(23);//it doesn't work for D because ISingleton was defined for parent class C
 b.Test(15);
}


A::Test(5) 11
A::Test(6) 12
B::Test(3) 1
B::Test(100) 2
C::Test(33) 1
C::Test(23) 2
B::Test(15) 3
~A

There are 3 singleton classes (A, B, C). As you can see, it's very simple to create a singleton class: just inherit publicly from ISingleton<ThisClass>.
However, a few cautions are written on lines 7, 14 and 15. The first concerns a custom default ctor and dtor. Usually a singleton class owns heap objects, so a custom destructor is needed. A custom default constructor may also be useful, but if initialization with arguments is needed, a custom Init method is the way to go, because ctors with arguments are unavailable with this singleton implementation.

Ok, look at the implementation code.


#ifndef _SINGLETON_PATTERN_INTERFACES_SVOLKOV_
#define _SINGLETON_PATTERN_INTERFACES_SVOLKOV_


#include <memory>


template <class T>
class ISingleton
{
protected:
 ISingleton();
 ~ISingleton();
private:
 ISingleton(const ISingleton&);
 ISingleton& operator= (const ISingleton&);
 static std::auto_ptr<T> _ptr;
 friend class std::auto_ptr<T>;
public:
 ///Get singleton instance.
 static T& GetSingleton();
};


template <class T>
std::auto_ptr<T> ISingleton<T>::_ptr;


template <class T>
ISingleton<T>::ISingleton()
{
}


template <class T>
ISingleton<T>::~ISingleton()
{
}


template <class T>
T& ISingleton<T>::GetSingleton()
{
 if (!_ptr.get())
  _ptr = std::auto_ptr<T>(new T());
 return *(_ptr.get());
}


#endif

Lazy initialization is used to postpone instantiation until first use. A standard smart pointer is used to destroy the object automatically when the program exits. This instantiation/deletion model is exactly what is usually needed for a singleton manager and the like.
Please note that this is a single-threaded implementation of the pattern, not a multi-threaded one.
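For the multi-threaded case, one widely used alternative (not the implementation above) relies on a function-local static: since C++11 its initialization is guaranteed to be thread-safe. The sketch below assumes a C++11 compiler; Singleton and Counter are illustrative names. Only the construction is protected, so the singleton's own methods still need their own synchronization.

```cpp
template <class T>
class Singleton {
protected:
    Singleton() {}
    ~Singleton() {}
public:
    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

    // Lazy: the instance is created on the first call.
    // C++11 guarantees this initialization runs exactly once,
    // even if several threads call GetSingleton() concurrently.
    static T& GetSingleton() {
        static T instance;
        return instance;
    }
};

// Example singleton with a private default ctor, as in class A above.
class Counter : public Singleton<Counter> {
    friend class Singleton<Counter>;
    Counter() : _value(0) {}
    int _value;
public:
    // Not thread-safe by itself; only the creation of Counter is.
    int Next() { return ++_value; }
};
```

Usage stays the same as before: Counter::GetSingleton().Next(). The object is destroyed automatically at program exit, like the auto_ptr version.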

Saturday, January 24, 2009

SIGSEGV on statement with std::string variable, G++.

If you get an error report like


Program received signal SIGSEGV, Segmentation fault.
0xb7d2be69 in __gnu_cxx::__exchange_and_add () from /usr/lib/libstdc++.so.6
(gdb) where
#0  0xb7d2be69 in __gnu_cxx::__exchange_and_add () from /usr/lib/libstdc++.so.6
#1  0xb7d0dfcc in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string () from /usr/lib/libstdc++.so.6
...


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb746b6d0 (LWP 9469)]
0xb7c5ce69 in __gnu_cxx::__exchange_and_add () from /usr/lib/libstdc++.so.6
(gdb) where
#0  0xb7c5ce69 in __gnu_cxx::__exchange_and_add () from /usr/lib/libstdc++.so.6
#1  0xb7c3e8e4 in std::string::assign () from /usr/lib/libstdc++.so.6
#2  0xb7c3e944 in std::string::operator= () from /usr/lib/libstdc++.so.6
...

or something similar,
check that the memory underlying the string variable isn't corrupted.
In my case the problem was zeroed memory. In Visual Studio I used an ugly but effective construction like


string a;
memset(&a, 0, sizeof(a));

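and I didn't have problems with it. But with g++, and in general, I have to be more careful: std::string is not a plain data structure, so overwriting its internals is undefined behavior.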
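If the goal of the memset was to reset a structure that contains strings, a safer way is to assign a value-initialized object. This is a sketch with a made-up Record type; for a lone std::string, a.clear() or a = std::string() does the same job.

```cpp
#include <string>

// Hypothetical structure mixing plain data with a std::string.
struct Record {
    int id;
    std::string name;
};

// Reset through value-initialization instead of memset:
// id becomes 0 and name becomes an empty, fully valid string.
void Reset(Record& r) {
    r = Record();
}
```

After Reset(r) the string's internal state is still owned and managed by std::string, so its destructor and assignment keep working on every compiler.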

Friday, January 23, 2009

Method pointer G++

Look at this code:


#include <iostream>
using namespace std;


class A
{
public:
 void Test(int i) 
 {
  mc(&A::ma, i);
  mc(mb, i);
 };
protected:
 void ma(int i) { cout << "A::ma(" << i << ") has been called" << endl;};
 void mb(int i) { cout << "A::mb(" << i << ") has been called" << endl;};
 void mc(void (A::*m)(int), int i) { cout << "A::mc "; (this->*m)(i);};
};


int main()
{
 A a;
 a.Test(5);
}

In Visual Studio it compiles fine, but g++ reports an error:

 
FPointer.cpp: In member function ‘void A::Test(int)’:
FPointer.cpp:11: error: no matching function for call to ‘A::mc(<unresolved overloaded function type>, int&)’
FPointer.cpp:16: note: candidates are: void A::mc(void (A::*)(int), int)

That's because g++ expects the strict standard syntax: a pointer to member must be formed with & and the qualified name (&A::mb). Compare lines 10 and 11 to fix this error.

Delegation pattern in C++. My recipe.

Look at the example code:


#include "DelegationPattern.h"
#include <iostream>
using namespace std;


class Notifier : public INotifier
{
public:
 void NotifyAll(int j)
 {
  std::vector<IDelegate*>::iterator i = _delegates.begin(), iEnd = _delegates.end();
  for (; i != iEnd; i++)
  {
   (*i)->Invoke(j++);
  }
 }
};


class A : ISubscriber<A>
{
public:
 void SubscribeAll(INotifier* notifier) { Subscribe(notifier, &A::ma); Subscribe(notifier, &A::mb);};
 void UnsubscribeAll(INotifier* notifier) { Unsubscribe(notifier, &A::ma); Unsubscribe(notifier, &A::mb);};
 void UnsubscribeSecond(INotifier* notifier) { Unsubscribe(notifier, &A::mb);};
 void SubscribeFirst(INotifier* notifier) { Subscribe(notifier, &A::ma);};
protected:
 void ma(int i) { cout << "A::ma(" << i << ") has been called" << endl;};
 void mb(int i) { cout << "A::mb(" << i << ") has been called" << endl;};
};

class B : ISubscriber<B>
{
 int _id;
public:
 B(int id) : _id(id) {}; 
 void SubscribeAll(INotifier* notifier) { Subscribe(notifier, &B::ma);};
 void UnsubscribeAll(INotifier* notifier) { Unsubscribe(notifier, &B::ma);};
 void ma(int i) { cout << "B(" << _id << ")::ma(" << i << ") has been called" << endl;};
};


int main()
{
 Notifier notifier;
 A a;
 
 {
  B b1(1);
  B b2(2);

  a.SubscribeAll(&notifier);
  notifier.NotifyAll(3);
  b1.SubscribeAll(&notifier);
  notifier.NotifyAll(10);
  a.SubscribeFirst(&notifier);
  a.UnsubscribeSecond(&notifier);
  a.UnsubscribeSecond(&notifier);
  b2.SubscribeAll(&notifier);
  notifier.NotifyAll(21);
 }
 notifier.NotifyAll(30);
}

Notifier is the event source. It must be a single instance and process all events.
Suppose you have 2 independent classes, A and B, which want to receive some events from the notifier. To achieve this they inherit from the ISubscriber interface and define simple subscribe/unsubscribe methods of their own. That's all. The program output is:


A::ma(3) has been called
A::mb(4) has been called
A::ma(10) has been called
A::mb(11) has been called
B(1)::ma(12) has been called
A::ma(21) has been called
B(1)::ma(22) has been called
B(2)::ma(23) has been called
A::ma(30) has been called

Here is the code that covers presented interfaces (DelegationPattern.h):


#ifndef _DELEGATION_PATTERN_INTERFACES_SVOLKOV_
#define _DELEGATION_PATTERN_INTERFACES_SVOLKOV_


#include <vector>
#include <algorithm>
class IDelegate;


///Simple notifier interface.
class INotifier
{
public:
 void Subscribe(IDelegate* delegate) { _delegates.push_back(delegate);};
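 void Unsubscribe(IDelegate* delegate)
 {
  std::vector<IDelegate*>::iterator i = std::find(_delegates.begin(), _delegates.end(), delegate);
  if (i != _delegates.end())//erasing end() would be undefined behavior
   _delegates.erase(i);
 };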

protected:
 std::vector<IDelegate*> _delegates;
};


/** Subscriber interface.
@remarks 
Any class can become subscriber to register own methods as callbacks.
*/
template <class T>
class ISubscriber
{
public:
 ///Destructor.
 virtual ~ISubscriber() = 0;
 ///Accepted arguments list for target callback method.
 typedef void (T::*CallbackMethod) (int);

 /** Register new delegate.
 @remarks Actually any other arguments can be added here to distinguish event type to subscribe etc.
 */
 void Subscribe(INotifier* notifier, typename ISubscriber<T>::CallbackMethod method);
 /** Unregister delegate.
 */
 void Unsubscribe(INotifier* notifier, typename ISubscriber<T>::CallbackMethod method);

protected:
 std::vector<std::pair<INotifier*, IDelegate*> > _delegates;
};


/** Delegate object interface.
*/
class IDelegate
{
public:
 ///Destructor. Virtual, because delegates are deleted through IDelegate*.
 virtual ~IDelegate() {};
 ///Call.
 virtual void Invoke(int) = 0;
};


/** Delegate object with callback method pointer.
*/
template <class T>
class MethodDelegate : public IDelegate
{
 template <typename U> friend class ISubscriber;
protected:
 T* _object;
 typename ISubscriber<T>::CallbackMethod _method;
public:
 ///Constructor.
 MethodDelegate(T* object, typename ISubscriber<T>::CallbackMethod method)
  : _object(object), _method(method) {};
 ///Call.
 void Invoke(int);
};


template<class T>
void MethodDelegate<T>::Invoke(int i)
{
 (_object->*_method)(i);
}


template <class T>
ISubscriber<T>::~ISubscriber<T>()
{
 std::vector<std::pair<INotifier*, IDelegate*> >::iterator i = _delegates.begin(), iEnd = _delegates.end();
 for (; i != iEnd; i++)
 {
  i->first->Unsubscribe(i->second);
  delete(i->second);
 }
}


template <class T>
void ISubscriber<T>::Subscribe(INotifier* notifier, typename ISubscriber<T>::CallbackMethod method)
{
 std::vector<std::pair<INotifier*, IDelegate*> >::iterator i = _delegates.begin(), iEnd = _delegates.end();
 for (; i != iEnd; i++)
 {
  if (i->first == notifier && ((MethodDelegate<T>*)i->second)->_method == method)
   return;//already subscribed
 }
 IDelegate* delegate = new MethodDelegate<T>((T*)this, method);
 _delegates.push_back(std::make_pair(notifier, delegate));
 notifier->Subscribe(delegate);
}


template <class T>
void ISubscriber<T>::Unsubscribe(INotifier* notifier, typename ISubscriber<T>::CallbackMethod method)
{
 std::vector<std::pair<INotifier*, IDelegate*> >::iterator i = _delegates.begin(), iEnd = _delegates.end();
 for (; i != iEnd; i++)
 {
  if (i->first == notifier && ((MethodDelegate<T>*)i->second)->_method == method)
  {
   i->first->Unsubscribe(i->second);
   delete i->second;
   _delegates.erase(i);
   return;
  }
 }
}


#endif

Note that only callback methods with a fixed argument list can be registered here (void method(int)); however, that should be enough. In my program I used (void method(object*)).
Also note that it's a single-threaded solution. I'm going to post a multi-threaded version later.
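If the fixed (int) signature ever becomes too narrow, one possible direction is to make the argument list a template parameter. The sketch below assumes C++11 (variadic templates, lambdas, std::function); Signal and Accumulator are hypothetical names, and unsubscription and automatic cleanup on destruction, which the header above handles, are deliberately left out to keep it short.

```cpp
#include <functional>
#include <vector>

// Event source whose callbacks take an arbitrary, fixed-per-signal
// argument list, e.g. Signal<int> or Signal<int, int>.
template <class... Args>
class Signal {
    std::vector<std::function<void(Args...)>> _slots;
public:
    // Register a member function of any class with a matching signature.
    template <class T>
    void Subscribe(T* object, void (T::*method)(Args...)) {
        _slots.push_back([object, method](Args... args) {
            (object->*method)(args...);
        });
    }

    // Call every registered callback in subscription order.
    void NotifyAll(Args... args) const {
        for (const auto& slot : _slots) slot(args...);
    }
};

// Example subscriber with a two-argument callback.
struct Accumulator {
    int total = 0;
    void Add(int a, int b) { total += a * b; }
};
```

With Signal<int, int> s and an Accumulator acc, s.Subscribe(&acc, &Accumulator::Add) followed by s.NotifyAll(2, 3) adds 6 to acc.total. The subscriber must outlive the signal here, since nothing unregisters it.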