This blog is subject to the DISCLAIMER below.

Sunday, January 14, 2007

Advanced C++ part 4 : Name Decoration and Intermixing C and C++ Code

If a file has two functions, add(int,int) and add(float,float), how would the linker distinguish them if some other file wanted the "add" function?
The symbol name generated in this case cannot simply be "add"; it might be something like __add_i_i or __add_f_f. It won't be "add" even if only one add function is defined, because another overload might be defined in another file.
The compiler is the one that chooses the symbol name; the linker is the one that uses it blindly. That's because the linker has no knowledge of the internal type system of a language, or of whether a language supports overloading (some linkers understand C++, but that's outside our scope). This is what makes it possible to link several languages into the same executable, for example Pascal with C, or C with C++.
The operation of converting a function or variable name to a symbol is known as name decoration or name mangling. There is no standard defining this operation; each compiler can do it however it likes, so it is implementation-specific. Even different versions of the same compiler can do it differently, so it might be hard to link with old object files or with files generated by other compilers (that's not a bug, it's a feature). Shared libraries compiled with one version of GNU GCC might not link with programs compiled with another version. Name mangling in C is simpler: it just adds an underscore '_' before the function name (or in some cases leaves the function name as it is). This is enough because C does NOT support function overloading, at least not in the standard; to get it you would have to modify the compiler to apply name decoration yourself.
Name mangling of C++-specific features, like namespaces and classes, can look more cryptic. For example:


void x(float z) {}
void x(int i) {}
int main()
{
    x(8.0f);
    x(4);
}


would give the symbols (on GCC 4.1.2):


main
_Z1xf
_Z1xi


but in this example:


namespace MyNameSpace
{
    void x(float z) {}
    class MyClass {
    public:
        static void x(int i) {}
    };
}
int main()
{
    MyNameSpace::x(8.0f);
    MyNameSpace::MyClass::x(4);
}


would give the symbols:


main
_ZN11MyNameSpace1xEf
_ZN11MyNameSpace7MyClass1xEi
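
To read symbols like these back, GCC and Clang ship a demangling routine, abi::__cxa_demangle, in <cxxabi.h>. The sketch below is only an illustration and assumes a compiler that follows the same (Itanium-style) mangling scheme as the GCC output above; the helper name demangle is made up for this example and is not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>  // GCC/Clang specific; not available on every compiler

// Hypothetical helper: returns the readable form of a mangled name,
// or the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the malloc'ed buffer
    return result;
}
```

With this helper, demangle("_ZN11MyNameSpace1xEf") should give back MyNameSpace::x(float), which shows that the namespace, the function name and the parameter type are all encoded in the symbol.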


If some other file wants to call one of these functions, its object file has to reference the exact mangled name shown above so the linker can find it.

If you have some old libraries written in C and you want to call them from C++, you have to declare them like this:

C++ source file:

extern "C" void x(int i );


or

extern "C" {
void x(int i );
int y (float z);
int f;
}


Note that you can still overload these functions, as long as you don't mark the overloaded versions as extern "C". The extern "C" linkage specification says that these names are to be mangled the C way, not the C++ way, so the linker can find the right function.
That's how you can call a C function from C++.
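
Here is a small sketch of that overloading rule, reusing the add example from the top of the post. It is only an illustration: the int version is defined in the same file just to keep the sketch self-contained, while in a real project it would presumably live in a separately compiled C file.

```cpp
#include <cassert>  // only needed if you check the results with assert

// C linkage: the symbol is plain "add" (or "_add" on some platforms).
extern "C" int add(int a, int b);

// C++ linkage: an overload with the same name gets a mangled symbol
// (something like _Z3addff on GCC), so it does not clash with the C one.
float add(float a, float b) { return a + b; }

// Stand-in for the C library's definition (an assumption of this sketch;
// normally this body would be compiled by a C compiler).
extern "C" int add(int a, int b) { return a + b; }
```

A call such as add(2, 3) then binds to the C symbol, while add(1.5f, 2.0f) binds to the mangled C++ one. Only one function with a given name can have C linkage, since C has no way to tell two of them apart.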
Calling a C++ function from C can be done in two ways. The first is to mark your C++ function as extern "C".

For example:

#include <iostream>
extern "C" void print() {
    int x;
    std::cin >> x;
    std::cout << "this is C++ code"; }


This method can be applied only to non-member functions; a class member function cannot be given C linkage. (A function declared extern "C" inside a namespace still gets the plain, unqualified C name.)
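
As a slightly fuller sketch of the first method, here is a function with C linkage whose body is free to use C++ library types. The name count_words is hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// C linkage: the symbol is plain "count_words", so a C file can declare it as
//   int count_words(const char *text);
// and call it like any other C function.
extern "C" int count_words(const char* text) {
    assert(text != nullptr);
    std::istringstream stream(text);  // C++ facilities are fine inside the body
    std::string word;
    int count = 0;
    while (stream >> word) {
        ++count;
    }
    return count;
}
```

Only the interface has to be something C understands: the parameter and return types must be plain C types, and exceptions should not be allowed to escape into the C caller.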
The second method lets you call functions inside a namespace, or static member functions, but it is manual.
Suppose we have the function:


_ZN11MyNameSpace1xEf


You would use that exact name in the C source, taking into account how the C compiler will mangle it further (for example, by adding a leading underscore). Perhaps this method can even let you call non-static member functions, but it requires you to pass the this pointer yourself.
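
Since hand-written mangled names break as soon as the compiler or its version changes, a different and more common approach is to wrap the C++ code in small extern "C" functions that take the object pointer explicitly. The sketch below shows that alternative, not the manual method described above; Counter, CounterHandle and the counter_* functions are all hypothetical names.

```cpp
#include <cassert>

namespace MyNameSpace {
// Hypothetical class with a non-static member function.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int add(int amount) { value_ += amount; return value_; }
private:
    int value_;
};
}

extern "C" {
// Opaque handle: C code only ever sees a pointer to an incomplete struct.
typedef struct CounterHandle CounterHandle;

CounterHandle* counter_create(int start) {
    return reinterpret_cast<CounterHandle*>(new MyNameSpace::Counter(start));
}

// The "this" pointer is passed explicitly as the first argument.
int counter_add(CounterHandle* handle, int amount) {
    assert(handle != nullptr);
    return reinterpret_cast<MyNameSpace::Counter*>(handle)->add(amount);
}

void counter_destroy(CounterHandle* handle) {
    delete reinterpret_cast<MyNameSpace::Counter*>(handle);
}
}
```

The C side then needs only the three declarations and the CounterHandle typedef; it never has to know the mangled names or the layout of the class.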

Next post will be about calling convention and intermixing C and assembly code.

Further reading:
Google "name decoration"

7 comments:

Ahmed M. Farrag said...

prrrrrrrrreeeeeeetttttttyyyy interesting...
we want more
we want more
we want more
:)

Mohammad Alaggan said...

Thanks for the encouragement :)

Mohamed Gamal El-Din said...

Great posts, Nabil
I have a request: I want a special post about how to increase C++ program performance
mmm, like which is better, const or #define? That was a small example to explain what I want to say
& another request: I want a comparison between C++, C# and Java :D (I think it will be very useful)

Ahmed M. Farrag said...

well, const would save space and reduce the resulting program size, OTOH #define would make it faster cuz it doesn't have to do a memory reference to get the value of that const
I guess....

Mohammad Alaggan said...

M.Gamal:
Thanks
I write about what you can't easily find in one source, and stuff you haven't heard about before. For what you're asking, at least you know what to search for. Try the book "Effective C++". About performance, we will talk later isA about meta-programming.

Mohammadeen:
Not necessarily, read the above book for explanation.

Mohamed Gamal El-Din said...

Thanks a lot ya Nabil, and isA I will check this book.